Preface


This  report  contains  the  edited  proceedings  of  a  symposium  on 
three-dimensional  displays  held  at  the  National  Academy  of  Sciences 
Building  in  Washington,  D.  C.  on  January  29,  1982.  The  meeting  was 
sponsored jointly by the National Academy of Sciences-National Research
Council's Committee on Human Factors and the Naval Air Systems Command
(Code 3100).

Technological developments in recent years have brought us to the
threshold of practical three-dimensional (3-D) display systems. However,
the imminent realization of such systems raises a number of perceptual,
human factors, and operational issues that must be resolved
before these displays can be employed to best advantage in applications.
The  goals  of  this  symposium  were  (1)  to  determine  what  we  know  presently 
about  visual  perception  and  human  factors  of  3-D  displays,  (2)  to 
identify  critical  issues  requiring  research,  and  (3)  to  identify  and 
explore some of the likely or possible areas of application, particularly
with regard to military operational needs.

The  symposium  was  organized  in  three  parts,  corresponding  to  the 
three  goals  just  stated.  In  the  first  part,  five  researchers  described 
basic  research  findings  on  3-D  display  systems,  or  issues  related  to 
3-D  perception.  In  the  second  part,  four  panelists  involved  in  applied 
research  related  to  3-D  displays  discussed  the  topic:  "Critical  Research 
Issues  in  3-D  Displays."  Finally,  in  the  third  part  of  the  symposium, 
a  panel  of  three  military  program  managers  (a  fourth  was  unable  to 
attend the meeting) discussed the topic: "The Applicability of 3-D Display
Research to Military Operational Needs."

The  successful  realization  of  this  symposium  was  possible  because 
of  the  efforts  of  several  people  whom  I  wish  to  acknowledge.  First,  the 
conception — and  much  of  the  early  planning — for  the  symposium  were  the 
result  of  the  enthusiastic  efforts  of  John  O'Hare  of  ONR.  I  thank  also 
Mildred  Webster  for  her  help  in  the  preparation  of  these  proceedings. 

My deep appreciation and thanks go to Bob Hennessy, Study Director for
the NAS-NRC Committee on Human Factors, who shared over the months the
burden of organizing the conference. Finally, I thank the symposium
participants for their contributions, all of which resulted in an engaging
and valuable meeting.


David  J.  Getty 
Chairman 
INTRODUCTION:

THREE-DIMENSIONAL  DISPLAYS 


David  J.  Getty 
Bolt Beranek and Newman Inc.
Cambridge,  Massachusetts 


The research discussed in this symposium focuses on visual perception
and human factors relating to three-dimensional displays. In this
introduction,  I  consider  what  we  mean  by  a  three-dimensional  display  and 
discuss  a  distinction  between  two  classes:  stereo-pair  displays  and 
volumetric  displays. 

In  a  very  general  sense,  three-dimensional  displays  include  all 
systems  that  provide  sufficient  information  to  an  observer — in  whatever 
form — to  permit  the  relative  localization  of  displayed  objects  in  three- 
dimensional  space.  This  definition  is  too  broad  for  our  purposes.  For 
example,  it  would  include  flat  displays  that  present  depth  information 
to an observer through coding techniques. Illustrations of such techniques
are the coding of depth through variations in the brightness or
size of displayed objects. It would also include other displays that
provide depth information solely through monocular cues. There are, of
course, many such cues: linear perspective, size of familiar objects,
interposition of objects, shadows, and texture gradients, to name some
of the more important ones. We use these cues constantly in interpreting
the relative depth of objects in flat, two-dimensional displays (e.g.,
photographs, paintings, movies, and TV). These are complex cues that
require a considerable amount of cognitive image processing in their
application. Furthermore, the effectiveness of many of them is dependent
upon familiarity with the objects being displayed.

All  of  the  types  of  display  described  above  have  in  common  that  the 
same information, contained in a flat, two-dimensional image, is presented
to each of the observer's two eyes. From this single image, the
observer  extracts  available  monocular  and  coding  cues  to  depth  in  order 
to  form  a  perception  of  the  relative  depths  of  displayed  objects,  to 
whatever  degree  possible.  We  will  exclude  this  class  of  displays  from 
our  consideration. 

The classes of displays that we do wish to include have in common
that different information is presented to each of the observer's two
eyes, different in particular in that binocular disparity is present.
Binocular disparity is also referred to commonly as retinal disparity,
binocular parallax, or horizontal parallax. In natural viewing of a
three-dimensional scene, binocular disparity is a straightforward
geometrical consequence of the fact that each eye is viewing the scene from
a slightly different vantage point. Because of the horizontal separation
of our eyes, the visual angle subtended between any two objects
located at different depths will necessarily be different in the two
eyes. This geometry is illustrated in the diagram below. With both
eyes fixated on object F, the angle subtended from the fovea by another
object A, located at a different depth than F, is larger in the left eye
than in the right. The difference in these angles is a measure of the
amount of binocular disparity.


DISPARITY = θL - θR
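
The geometry just described can be sketched numerically. The following is a
minimal illustration, not taken from the symposium: it assumes two eyes on a
horizontal line separated by 65 mm (an assumed typical value), places the
fixated object F and a second object A in a horizontal plane, and computes
the angle between F and A at each eye. The function names and the coordinate
convention are hypothetical.

```python
import math

def eye_angle(eye_x, fixation, target):
    """Signed angle (radians) at one eye between the fixated point and a target.

    Points are (x, z) pairs in metres: x is lateral position, z is the
    distance straight ahead of the observer. The eye sits at (eye_x, 0).
    """
    fx, fz = fixation
    tx, tz = target
    return math.atan2(tx - eye_x, tz) - math.atan2(fx - eye_x, fz)

def binocular_disparity(fixation, target, eye_separation=0.065):
    """Disparity as the left-eye angle minus the right-eye angle."""
    half = eye_separation / 2.0
    theta_left = eye_angle(-half, fixation, target)
    theta_right = eye_angle(+half, fixation, target)
    return theta_left - theta_right

# Fixate F at 1 m; A lies on the same line of sight but at 0.5 m.
F = (0.0, 1.0)
A = (0.0, 0.5)
print(math.degrees(binocular_disparity(F, A)))
```

Under this sign convention an object nearer than the fixation point yields a
positive value and a farther object a negative one; an object at the fixation
point yields zero.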


As early as 1838, Charles Wheatstone had demonstrated that binocular
disparity is sufficient to yield a strong perception of depth.
For use in a mirror stereoscope, which he invented, he drew two different
pictures of a solid object, representing the slightly different views of
the object as would be seen by the two eyes at arm's length. When the
images were viewed in the stereoscope with each image channeled to the
appropriate eye, he found that the object appeared in depth and occupied
a volume of space just as did its real counterpart. The perception
of depth resulting from binocular disparity is called stereopsis, and is
believed to be the single most potent of the visual cues to depth.

There are two distinct classes of stereopsis-based displays. The
first class, which we may call stereo-pair displays, is exemplified
nicely by Wheatstone's original stereoscope. For these displays, a
pair of images is constructed containing horizontal disparity appropriate
for the relative depth of each object to be displayed. The image
construction  process  may  be  as  simple  as  taking  two  photographs  of  a 
scene,  moving  the  camera  sideways  by  several  inches  between  pictures  to 
create  the  two  disparate  views,  or  as  complex  as  using  a  computer  to  do 
geometric  modeling  of  a  three-dimensional  scene  or  process.  Having 
constructed  a  pair  of  two-dimensional  images,  each  is  then  transmitted 
independently  to  the  appropriate  eye.  Since  Wheatstone's  invention  of 
the  mirror  stereoscope,  many  other  stereo-pair  displays  have  been 
developed, differing primarily in the methodology for independent delivery
of the two images, one to each of the two eyes. The techniques
employed have included mirrors, prisms, crossed-polarizer glasses, red
and green filter glasses, lenticular screens, and alternating shutter
glasses. Some of these methods are discussed in more detail by Fox in
Chapter 1, Piantanida in Chapter 3, and Uttal et al. in Chapter 5.

The second class of stereopsis-based displays may be called
volumetric or space-filling displays. As suggested by the name, these
displays are based on a single real or virtual image which quite literally
fills a three-dimensional volume of space. Over the past 40 years
several volumetric displays have been developed, based on different
techniques. Examples are displays produced by rapid rotation of a flat,
dense matrix of LEDs through a volume, holograms, and displays produced
by oscillating movement of a flexible, vari-focal mirror. Research
using a particular realization of this last technique is discussed by
Huggins and Getty in Chapter 2.

While both stereo-pair displays and volumetric displays are similar
in that they activate human stereopsis, they differ in several significant
ways. These differences, listed in the table below, have strong
implications for the types of application for which each class of display
is suited.


                     Stereopsis-Based Displays

Stereo-Pair Displays                 Volumetric Displays

. Two 2-D images                     . One space-filling image

. Binocular disparity produced       . Binocular disparity produced by
  by image generation process          natural separation of observer's
                                       eyes

. Depth coordinates are not          . Depth coordinates are required
  necessarily required for             for image generation
  image generation

. Single point of view,              . Multiple points of view,
  controlled by display                controlled by viewer

The first difference concerns binocular disparity. In a stereo-pair
display, the amount of binocular disparity present is determined
by the process used to generate the pair of images. Whether generated
by two cameras or by a computer, the horizontal separation between the
two points-of-view is arbitrary and can be varied to magnify or minify
perceived depth. In a volumetric display, the amount of binocular
disparity present is fixed by the natural separation between the observer's
two eyes. The ability to manipulate and exaggerate depth in stereo-pair
displays makes them of particular interest in applications where
objects and background are difficult to discriminate because of minimal
actual depth differences, camouflage, or both.
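
The dependence of disparity on the separation between the two points of view
can be illustrated with a short sketch. It assumes an idealized pair of
pinhole cameras with parallel optical axes, which is only one of several
possible imaging geometries and is not a configuration specified in the text;
the function names, the unit focal length, and the sample points are all
hypothetical.

```python
def project(point, camera_x, focal_length=1.0):
    """Pinhole projection of a scene point (X, Y, Z) for a camera located
    at (camera_x, 0, 0) and looking along the +Z axis."""
    X, Y, Z = point
    return (focal_length * (X - camera_x) / Z, focal_length * Y / Z)

def stereo_pair(points, baseline, focal_length=1.0):
    """Left and right 2-D images of the same points, taken from two
    viewpoints separated horizontally by `baseline`."""
    half = baseline / 2.0
    left = [project(p, -half, focal_length) for p in points]
    right = [project(p, +half, focal_length) for p in points]
    return left, right

def horizontal_disparity(left, right):
    """Horizontal image offset (left minus right) for each point."""
    return [l[0] - r[0] for l, r in zip(left, right)]

# Two objects whose depths differ only slightly (2.0 m versus 2.5 m).
scene = [(0.0, 0.0, 2.0), (0.1, 0.0, 2.5)]
natural = horizontal_disparity(*stereo_pair(scene, baseline=0.065))
widened = horizontal_disparity(*stereo_pair(scene, baseline=0.130))
print(natural, widened)
```

In this idealization the disparity of each point is proportional to the
baseline divided by the point's depth, so doubling the camera separation
doubles the disparity difference between the two objects. Note that the
image generation uses scene coordinates here only because the scene is
synthetic; a real camera pair would produce the same images without them.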

A  second  difference  concerns  knowledge  of  the  three-dimensional 
coordinates  of  each  object  to  be  displayed.  In  stereo-pair  displays, 
the image generation process does not necessarily require any explicit
knowledge about the locations of displayed objects in three-dimensional
space. For example, a stereo pair of television cameras generates
images in which the geometry of the imaging system and the
three-dimensional scene is sufficient to place each object at the appropriate
location within each image. On the other hand, volumetric displays
require the three-dimensional coordinates of each object to be displayed
in order to place each one at the appropriate location within the
display volume.

These differences between stereo-pair and volumetric displays have
significant implications for applications. Clearly, stereo-pair displays
are well-suited to situations requiring remote viewing of natural
three-dimensional scenes. These scenes can be reconstructed in depth
for the observer without any system knowledge of object location or
depth. On the other hand, in simulation or modeling applications where
coordinate information is available, volumetric displays offer a distinct
advantage owing to the following difference. For a given static
stereo-pair of images, the observer receives a view of a three-dimensional
scene dictated by the location of the "eyes" of the imaging
system. As discussed by Rosinski in Chapter 4, the observer is then
constrained to view the stereo-pair with his eyes in the same positions
relative to the images or suffer visual distortions of several types.
Furthermore, for systems in which the display surfaces are not held in
a fixed relationship to the observer's eyes, as in stereo TV, the
observer is constrained to keep his head horizontal to match the direction
of binocular disparity in the images. Volumetric displays, on the
other hand, permit the observer to freely translate or rotate his head
and body (within the viewing limits of the display), obtaining continuously
changing perspectives of the scene correlated with movement, just
as we do in natural viewing of the real world. The ability to look
around inside, over, and under displayed objects is clearly an important
property of volumetric displays for many applications, especially
those involving complex three-dimensional shapes or spatial relationships
among many objects.

This paper summarizes some of the conclusions that have emerged
from a program of research that my colleagues and I have underway that
is concerned broadly with the processing of visual information from
three-dimensional displays. One specific interest, and the one explained
in the paper, concerns the effect of depth position on the
interaction among stimuli. It will be helpful to begin by explaining
what is meant by those phrases, depth position and interaction among
stimuli.

I'm  using  stimulus  interaction  to  apply  collectively  to  a  wide 
range  of  visual  phenomena  that  have  in  common  changes  in  the  perceived 
attributes  of  a  stimulus  that  are  caused  by  the  contextual  stimulation 
surrounding that stimulus. The change can be destructive or inhibitory
in the sense that the perceptibility of the stimulus is impaired.
Or,  an  apparent  distortion  of  one  of  the  dimensions  of  the  stimulus 
can  occur.  For  example,  in  Figure  1  the  well-known  phenomenon  of 
simultaneous  contrast  is  illustrated.  Here  the  apparent  brightness 
of  the  inner  gray  circles,  which  have  the  same  objective  reflectance, 
is  altered  by  the  brightness  of  the  contextual  squares  in  which  they 
are  embedded.  In  Figure  2,  conditions  for  the  phenomenon  of  lateral 
interference  are  illustrated.  Lateral  interference,  which  is  also 
called  "crowding"  in  the  ophthalmic  literature,  refers  to  the  impaired 
perceptibility  of  a  form  when  it  is  surrounded  or  flanked  by  other 
forms, relative to when it is seen in isolation. Finally, Figure 3
illustrates  the  stimulus  configuration,  which  is  sometimes  referred 
to  as  the  Ponzo  illusion,  that  produces  a  distortive  interaction  such 
that the apparent length of the parallel lines is altered by the linear
perspective cue formed by the railroad tracks.

These examples serve to define what is meant by stimulus interaction.
Consider now depth position. This refers to the position of
the interacting stimulus elements along the Z axis. Almost all of the
considerable research on various kinds of stimulus interaction has
dealt only with the two-dimensional case, where the X and Y positions
of the elements are varied while the Z axis value remains constant.

The  question  that  arises  naturally  from  a  consideration  of  three- 
dimensional  space  is  whether  stimulus  interactions  would  be  modified 
if  the  interacting  elements  occupied  different  perceived  depth  planes. 
At  the  most  general  theoretical  level  the  answer  bears  upon  which  of 
two general theories of visual space perception is more nearly correct.
At a more specific level it bears on the adequacy of models
developed  for  specific  interactive  phenomena,  which,  in  general,  have 
invoked  the  assumption  that  perceived  depth  position  is  irrelevant  to 
the  occurrence  of  interactions.  Because  these  theoretical  issues  have 
been  treated  elsewhere,  there  is  no  compelling  need  to  consider  them 
today.  For,  independent  of  theory,  the  answer  to  the  question  of 
depth  separation  is  of  general  empirical  interest. 

Figure 1. An illustration of simultaneous contrast. Although the two
circles are physically equal in brightness, the circle on the dark
field appears brighter than the circle on the light field (after
Dember & Warm, 1979).

Figure 2. An illustration of the conditions that produce lateral
interference. When the figures X, B, and Z are embedded within the row
of other figures they are more difficult to resolve relative to when
they are seen in isolation.

Yet only a small number of experiments have explored the effect
of perceived depth position on stimulus interaction because it is
technically quite difficult to produce changes in depth without, at
the same time, introducing confounding changes in proximal stimulation.
Nevertheless, it has been possible to implement successfully
some experiments. In this regard, the research program of Walter
Gogel (e.g., Gogel, 1978) is particularly noteworthy. In his research
on space perception, Gogel has been led to develop an hypothesis known
as the adjacency principle, which says, in effect, that the interaction
among stimuli in visual space is inversely related to the X, Y,
and Z distance separating them. To test the adjacency principle,
Gogel has devised various optical methods that produce changes in
perceived depth through the manipulation of different kinds of depth
cues. For example, in one experiment Gogel and Newton (1975) used
such cues to produce an apparent depth separation between the rod and
the frame that comprise the well-known rod and frame illusion. Figure 4
illustrates schematically the general arrangement. When a vertically
oriented rod is enclosed within a tilted frame, the frame acts
to induce an apparent tilt in the rod. In Gogel and Newton's experiment,
when both rod and frame were in the same depth plane the expected
tilt of the rod was obtained. But when the rod and frame were in
different perceived depth planes, such that the rod appeared closer
to the observer than the frame, the effect of the frame was significantly
reduced.

Another example of the effect of perceived depth position is provided
by Allan Gilchrist (1977), who has been working within a theoretical
framework different from that of Gogel. Gilchrist's basic
experimental situation is illustrated in Figure 5. It is well established
that the perceived brightness of a particular surface is controlled
by the relative amounts of light reflected from adjacent surfaces.
Gilchrist used the depth cue of interposition to make the
target, which remained at a constant physical luminance, appear at
a far depth plane adjacent to a highly illuminated surface or at a
near depth plane adjacent to a more dimly illuminated surface. The
perceived change in depth position produced a large change in the
perceived brightness of the target. As shown in the panel at the
bottom of Figure 5, the target appeared quite bright, with a Munsell
value of 9 on a 10-point scale, when seen next to the dimmer surface
at the near position. Yet when seen against a brighter surface at
the far position, the target appeared much dimmer, with a Munsell value
of 3.5. This result indicates that perceived depth plays a much more
significant role in determining the final percept than do the actual
luminances that are impinging on the retina.

As I mentioned earlier, experiments on the effect of depth separation
are difficult to implement because perceived depth must be induced
in a convincing way without, at the same time, introducing confounding
changes in proximal stimulation. This has severely restricted
the kinds of perceptual actions that can be examined and the magnitude
of the variables that can be manipulated. For example, in the
Gilchrist situation, it is not possible to systematically vary the
perceived depth position of a target over a series of values.

Figure 4. Displacement in depth of the rod relative to the frame
(from Gogel & Newton, 1975).

These considerations motivated my colleagues and me to develop a
more flexible method of pursuing the question of the effect of perceived
depth on stimulus interactions. The approach we have taken
employs stimuli constructed from random element stereograms. As shown
in Figure 6, random element stereograms, which were developed by
Julesz (1960, 1971), consist of large arrays of randomly organized dots
or elements. Neither the left-eye view nor the right-eye view contains
any recognizable shape or contour. But retinal disparity that induces
stereoscopic depth can be introduced by displacing a subset of elements
within a matrix viewed by one eye. This displacement is camouflaged,
however, by the myriad of surrounding dots and cannot be seen.
But when an observer who possesses stereopsis views the left- and
right-eye views under stereoscopic conditions, the disparity is detected
by the visual system and this results in the perception of a
palpable, clear-cut stereoscopic form standing out in depth. These
forms originate from some central stage of the visual system where
inputs from both eyes are combined and, in that sense, they bypass
the retina or other peripheral stages. Nevertheless, the stereoscopic
or cyclopean contours have been shown to possess many of the functional
characteristics of physical contours. That is, they can induce
aftereffects and eye movements, and they interact in much the same manner
as their physical counterparts. Our approach has been to replicate the
interactions that occur among physical stimuli in stereoscopic space
with stereoscopic stimuli. This allows perceived depth to be changed
very easily and eliminates entirely the problem of confounding changes
in proximal stimulation.
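
The construction of a static random element stereogram can be sketched in a
few lines. This is an illustrative reconstruction of the general Julesz
technique under simple assumptions (binary dots, a square target region, a
whole-pixel shift); it is not the procedure used by the hardware system
described in this paper, and the function name and parameter values are
hypothetical.

```python
import random

def random_dot_stereogram(size=64, square=24, shift=4, seed=0):
    """Return (left, right) binary dot matrices, each a list of rows.

    The right-eye matrix is a copy of the left-eye matrix in which a
    central square of dots is displaced `shift` columns to the left.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    lo = (size - square) // 2
    hi = lo + square
    for r in range(lo, hi):
        # Copy the central square into the right-eye view, displaced leftward.
        for c in range(lo, hi):
            right[r][c - shift] = left[r][c]
        # Refill the strip uncovered by the displacement with fresh random
        # dots so that neither view alone contains a visible contour.
        for c in range(hi - shift, hi):
            right[r][c] = rng.randint(0, 1)
    return left, right

left, right = random_dot_stereogram()
print(sum(map(sum, left)), sum(map(sum, right)))
```

Displacing the subset leftward in the right-eye view corresponds to crossed
disparity, so under stereoscopic viewing the square would be expected to
appear in front of the surrounding dots; reversing the direction of the shift
would place it behind.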

Our efforts have been greatly facilitated by the development, at
Vanderbilt, of a system for generating, in real time, dynamic random
element stereograms. The major components of the system are illustrated
in Figure 7. The display device can be any one of several kinds
of slightly modified color video receivers. By directly modulating the
red and green electron guns, thousands of red and green dots are
continuously generated many times a second. When an observer views the
display with appropriate red and green filters before the eyes, the
red and green dots are physically segregated to separate eyes, thereby
fulfilling the conditions of stereoscopic viewing (i.e., the well-known
anaglyph method of stereoscopic presentation). All parameters of the
stereoscopic displays, such as depth magnitude and direction, are
controlled by a hardwired electronic unit composed of integrated circuits.
This is represented by the box marked "Stereogram Generator". The box
marked "Optical Scanner" consists of modified TV cameras that operate
as flying spot scanners. Any two-dimensional form that is seen or
scanned by the cameras is immediately converted into its stereoscopic
equivalent. Even complex shapes undergoing continuous motion can be
presented as stereoscopic or cyclopean configurations. This system is
quite flexible, and is being used in a variety of investigations
concerned with perception of space and stereoscopic depth perception.
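
The anaglyph principle mentioned above can be shown with a small software
sketch. It only illustrates how two monocular dot patterns can share one
color display and be separated again by filters; it does not model the
hardwired generator or the video receiver. The assignment of the left view to
red and the right view to green is an assumption, as are the function names.

```python
def anaglyph(left, right, level=255):
    """Merge two binary dot matrices into rows of (R, G, B) pixels,
    carrying the left-eye view in red and the right-eye view in green."""
    return [[(level * l, level * r, 0) for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]

def through_filter(image, channel):
    """Dots visible through an ideal filter passing one channel
    (0 for red, 1 for green)."""
    return [[1 if pixel[channel] else 0 for pixel in row] for row in image]

# Two tiny monocular patterns that differ by a one-column displacement.
left_view = [[1, 0, 0, 1],
             [0, 1, 1, 0]]
right_view = [[0, 0, 1, 1],
              [1, 1, 0, 0]]
combined = anaglyph(left_view, right_view)
print(through_filter(combined, 0) == left_view,
      through_filter(combined, 1) == right_view)
```

With ideal filters each eye recovers exactly its own pattern; real filters
leak some light from the other channel, which this sketch ignores.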


Figure 6. The two monocular patterns of a typical static
random-element stereogram. When each pattern stimulates
a separate eye, a stereoscopic form can be
perceived (after Julesz, 1971).
Figure 7. Display, programming, and logic units of the stereogram
generation  system. 

With the aid of this system we have pursued investigations of the
effect of perceived depth position on those stimulus interactions that
involve the elevation of threshold, and those that involve distortive
interactions at the suprathreshold level. In all cases we have found
that perceived depth position exerts a strong effect on the magnitude
of the interaction. In one study my colleague, Bob Patterson, and I
(Fox and Patterson, 1981b) chose the Ponzo illusion as an example of
a suprathreshold distortive interaction. This illusion was mentioned
at the beginning of this paper and shown in Figure 8. We investigated
the effect of the perceived depth position of the inducing angle on
the test lines using stereoscopic contours formed from the random
element stereogram display. The essentials of our method are illustrated
in Figure 9. The stimuli always appeared in front depth (with crossed
disparity) in the visual space between the display and the observer.
The test lines remained at the same depth position while, in different
experimental conditions, the inducing angle was located at perceived
depth positions in front of and behind the test lines, as well as in
a depth position equal to that of the test lines. The perceived width
of the angle was kept constant, at all depth positions, to compensate
for the perceived change in width produced by size constancy. The
magnitude of the illusion, which is the perceived difference in length
between the top and bottom lines, was assessed by the method of
magnitude estimation.
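
The compensation for size constancy can be made concrete with a worked
relation. The paper does not state how the compensation was computed; the
sketch below assumes the standard size-distance invariance idealization, in
which perceived width S, perceived distance D, and visual angle θ are related
by S = 2 D tan(θ/2). The function names and the sample values are
hypothetical.

```python
import math

def angular_width(perceived_width, perceived_distance):
    """Visual angle (radians) needed for a stimulus to appear to have
    the given width at the given perceived distance."""
    return 2.0 * math.atan(perceived_width / (2.0 * perceived_distance))

def perceived_width(angle, perceived_distance):
    """Width implied by a visual angle at a perceived distance."""
    return 2.0 * perceived_distance * math.tan(angle / 2.0)

# Hold perceived width at 10 cm while the stimulus is placed at three
# perceived distances (metres) in front of the display.
for distance in (0.8, 1.0, 1.2):
    print(distance, math.degrees(angular_width(0.10, distance)))
```

Under this assumption a stimulus moved to a nearer perceived depth must be
drawn with a larger visual angle, and one moved farther with a smaller visual
angle, if its perceived width is to stay the same.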

The results are given in Figure 10. When both angle and test
lines were in the same depth plane, a significant illusion was obtained
relative to that observed under the control condition, where
illusion magnitude is measured in the absence of the inducing angle.
Yet, for the positions marked "back", which refers to the case where
the inducing angle is in back of the test lines and closer to the display
screen, that is, the test lines are in front of the angle, illusion
magnitude declined significantly. But for the positions marked
"front", which refers to the angle being in front of the test lines,
illusion magnitude did not decline. These data indicate that perceived
depth position plays a significant role in determining illusion
magnitude. Moreover, the effect is asymmetrical in that the stimulus
interaction declines only when the target, the acted-upon stimulus, is
in a depth plane in front of the inducing stimulus and closer in space
to the observer.

Figure 9. Stimulus arrangement showing relative depth of test
lines (in front) and inducing angle. Note that the
term "inducing angle" is synonymous with induction
wedge in the figure (from Fox & Patterson, 1981).

This same asymmetry was also found in an investigation of lateral
interference (Fox & Patterson, 1981a). Recall, from the introduction,
that lateral interference refers to the impaired perceptibility
of a stimulus produced by spatially adjacent stimuli that are continuously
present in the visual field. To investigate lateral interference
we used stereoscopic contours as stimuli. Their configuration,
and the general experimental arrangement, are shown in Figure
11. The target is a Landolt C, in which the gap position can be
varied on a random basis to either of two positions, 3:00 and 9:00
o'clock. Interference is produced by the continuously present annulus
which surrounds the target. The target remained in the same perceived
depth position, while over the experimental conditions the annulus was
located in front, back, and equal perceived depth positions with respect
to the target. As in the previous experiment, compensation was
made for perceived changes in the width of the annulus produced by
size constancy. To assess interference, in one experiment, we obtained
forced-choice recognition thresholds for the target as a function
of perceived depth position of the annulus. The target was briefly
presented, for durations on the order of 60-80 msec, and observers
were required to make forced-choice judgments about gap location,
which varied randomly over trials. In a separate experiment, interference
was assessed by presenting the target continuously and obtaining
ratings of its clarity, as a function of the perceived depth position
of the annulus.

The results for the recognition threshold experiment are given in
Figure 12. The pre- and post-session control conditions refer to
thresholds obtained in the absence of the annulus. "Equal" refers to
the case where both annulus and test were in the same depth plane.
The addition of the annulus reduced recognition performance significantly.
The back depth positions refer to the case where the annulus
is positioned in back of the test line. This produced a significant
increase in performance relative to the equal depth condition. In
the front depth positions, the annulus is in front of the target.
This produced a significant decrease in performance. This same pattern
of results was obtained when the target was continuously present
and ratings of its clarity were made.

As  Figure  12  indicates,  while  perceived  depth  position  exerts  a 
strong  effect  on  lateral  interference,  the  effect  is  asymmetrical. 
After  observing  this  asymmetry  in  a  number  of  experiments,  we  have 
dubbed it the "front effect". That is because the stimulus in front
of its partner and closest to the observer seems to have a perceptual
advantage.

What  might  be  the  origin  of  the  front  effect?  We  have  speculated 
that  it  might  represent  an  intrinsic  preference  of  the  visual  system 
for  the  closer  stimulus--not  unlike  the  dominance  of  figure  over 
ground. This preference would be involved whenever there are spatially
adjacent stimuli, i.e., close to one another in the X and Y plane,
which  could  compete  for  attention.  It  would  be  adaptive  to  give  the 
greatest  weight  to  the  closest  stimulus  since  it  would  demand  the  most 
immediate  action. 

Consistent  with  this  view,  interaction  among  stimuli  declines 
with  increasing  separation  in  the  X  and  Y  planes.  Further,  this  view 
would  suggest  that  the  asymmetry  of  the  front  effect  would  not  occur 
if  the  interacting  stimuli  were  not  present  simultaneously  in  the 
visual  field. 

In the interactions considered so far, all stimulus elements have
been present simultaneously in the field. There are, however, what
might be termed successive interactive phenomena, in which the
aftereffects of stimulation by one stimulus alter the characteristics
of the stimulus subsequently presented, yet the two stimuli are never
simultaneously present. Therefore, during successive interaction the
front effect should not occur.

To explore that possibility, we used the classic waterfall illusion,
or motion aftereffect, as the successive interactive phenomenon
(Fox, Patterson, & Lehmkuhle, 1982). The motion aftereffect occurs
when  one  views  for  some  seconds  contours  moving  continuously  in  one 
direction.  When  that  motion  stops,  the  stopped  contours  now  appear 
to  move  in  the  opposite  direction.  We  used  the  stereogram  generation 
system to generate an array of stereoscopic moving contours. Observers
viewed the array for 90 seconds. Then the motion stopped and
the  stationary  contours  were  viewed.  The  moving  contours  remained  in 
one  perceived  depth  position,  and  the  stationary  contours  were  placed 
in  different  perceived  depth  positions.  A  fixation  stimulus  in  the 
depth  plane  of  the  moving  contours  kept  the  eyes  aligned  in  the  same 
depth  plane  for  all  experimental  conditions.  The  main  results  are 
illustrated  in  Figure  13.  The  motion  aftereffect  was  strongest  when 
both  test  and  inducing  stimuli  were  in  the  same  depth  plane.  But  as 
the  depth  separation  between  test  and  inducing  stimuli  increased,  the 
magnitude  of  the  aftereffect  decreased  significantly.  But  in  this 
instance,  the  decrease  is  symmetrical  in  that  both  front  and  back 
depth  positions  show  an  equivalent  decrease.  These  data  suggest, 
then, that the asymmetry of the front effect occurs only when adjacent
stimuli are simultaneously present. Nevertheless, they do show
that  depth  position  per  se  plays  a  significant  role  in  governing  this 
interaction. 

Indeed, the Z axis or depth position of stimuli seems to be an
influential  variable  in  determining  the  ease  with  which  information 
can  be  processed  in  three-dimensional  displays.  This  is,  of  course, 
not  the  only  such  variable. 

I would like to briefly describe three other variables, or effects,
that my colleagues and I are investigating that can also influence
information processing. The first concerns an asymmetry in the
perceptibility of stereoscopic stimuli between the top half and
bottom half of the display that is brought about by the geometry of
stereoscopic space. Consistent with a conjecture by Helmholtz, it
has been recently demonstrated by Nakayama (1977) and by Cogan (1979)
that the vertical horopter, which is that line or axis extending
vertically in space where binocular targets are seen as single, is
tilted away from the observer. Such a tilt can alter the relative
perceptibility of stimuli falling in the upper and lower visual
fields, depending upon the sign or depth direction of the stimuli.
Indeed, Julesz, Breitmeyer, and Kropfl (1976) have reported such an
asymmetry, and we have also found it. We have also found that the
inherent asymmetry of stereoscopic space can be modified by physically
tilting the display (Fox & Patterson, 1981c).

Figure 13. Aftereffect duration as a function of the
separation in depth of the test and induction
gratings (from Fox & Patterson, 1982). Abscissa:
difference in apparent depth between test and
induction gratings (cm).

Second, the perceived size of stereoscopic stimuli changes according
to their perceived depth position. As I mentioned earlier, this
is due to size constancy and is quite analogous to changes in the size
of afterimages that occur as projection distance varies. We have some
data which suggest that the apparent reduction in size of a
stereoscopic form also reduces the discriminability of details
embedded within it.

Finally,  the  perceived  depth  of  any  stereoscopic  form,  that  is, 
how  far  it  appears  to  stand  out  in  depth  in  terms  of  some  absolute 
metric  such  as  centimeters,  is  only  partially  determined  by  retinal 
disparity.  Due  to  depth  constancy  and  related  perceptual  processes, 
perceived  depth  can  be  strongly  influenced  by  the  perceived  distance 
between  observer  and  display,  and  by  the  presence  of  other  stimuli 
in the field that appear at different depth positions.

Information on these phenomena is not complete, however, and research
continues. As the data accrue they will contribute to the
growing base of information that can be used to optimize the design
of three-dimensional displays. Indeed, at present that data base is
sufficient to support the formulation of at least some general
recommendations about design. For instance, research on depth position
would seem to speak directly to the question of where, in depth,
critical signals should be located.

Yet one cautionary note should be registered with respect to the
wholesale, literal application of that information. Almost all of the
research on three-dimensional space and on stereoscopic depth
perception is based on results obtained from a small number of elite
observers who have received extensive training. Relatively little is
known about the binocular visual capacities of the general population.
It is known, however, that individuals vary widely in their visual
capacities with respect to space perception, and that training can
have considerable impact on these capacities. It would seem worthwhile
to learn more about the abilities of the normal observer, so that we
don't build displays that only the elite can use.




Cogan, A. I. (1979). The relationship between the apparent vertical
and the vertical horopter. Vision Research, 19, 655-665.

Fox, R. & Patterson, R. (1981a). Depth separation and lateral
interference. Perception & Psychophysics, 30, 513-520.

Fox, R. & Patterson, R. (1981b). Effect of depth separation on the
Ponzo illusion (Tech. Rep. N14-1101 81C-0001). Nashville, TN:
Vanderbilt University, Department of Psychology.

Fox, R. & Patterson, R. (1981c). Effect of the tilted vertical
horopter on visual recognition (Tech. Rep. N14-1101 81C-0002).
Nashville, TN: Vanderbilt University, Department of Psychology.

Fox, R., Patterson, R., & Lehmkuhle, S. (1982). Effect of depth
position on the motion aftereffect. Paper presented at the meeting of
the Association for Research in Vision and Ophthalmology, Sarasota.

Gilchrist, A. (1977). Perceived lightness depends on perceived
spatial arrangement. Science, 195, 185-187.

Gogel, W. C. (1975). Depth adjacency and the rod-frame illusion.
Perception & Psychophysics, 18, 163-171.

Gogel, W. C. (1978). The adjacency principle in visual perception.
Scientific American, 238, 126-139.

Julesz, B. (1960). Binocular depth perception of computer-generated
patterns. The Bell System Technical Journal, 39, 1125-1162.

Julesz, B. (1971). Foundations of Cyclopean Perception. Chicago:
University of Chicago Press.

Julesz, B., Breitmeyer, B., & Kropfl, W. (1976). Binocular-disparity-
dependent upper-lower hemifield anisotropy and left-right hemifield
isotropy as revealed by dynamic random dot stereograms. Perception,
5, 129-141.

Nakayama, K. (1977). Geometric and physiological aspects of depth
perception. Proceedings of the Society of Photo-Optical
Instrumentation Engineers, 120, 2-9.




DISPLAY-CONTROL  COMPATIBILITY  IN  3-D  DISPLAYS 


A.  W.  F.  Huggins  and  D.  J.  Getty 
Bolt Beranek and Newman Inc.
50 Moulton Street, Cambridge, MA 02238


ABSTRACT 

We  explore  some  problems  of  display-control  compatibility  that 
confront  the  human  operator  of  a  true  volumetric  3-D  display.  We 
measured  the  speed  and  accuracy  with  which  an  operator  can  make 
decisions about directions in displayed object-space (up, down, left,
right, etc.) when an object is presented in unpredictable orientations.
Operators  employ  three  strategies  in  this  task.  In  order  of  decreasing 
speed  and  accuracy,  they  are:  (1)  a  spatial  strategy,  which  can  be 
applied  only  when  the  display  object  and  the  control  object  are  in  the 
same  orientation;  (2)  a  relational  strategy,  in  which  the  choice  is 
made on the basis of the spatial relationship between the cued
direction and the orientation cue provided in the icon; and (3) a
rotational strategy, in which the operator mentally rotates his body
position  so  as  to  make  the  orientation  of  the  displayed  object 
equivalent  to  that  of  the  control  object.  In  the  third  strategy, 
response  times  increase  progressively  with  the  amount  of  rotation 
required. In the final experiments, we show that use of the third
strategy  can  be  avoided  by  appropriate  coding  of  the  display  icon. 
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Introduction 


The  purpose  of  the  work  described  is  to  explore  some  of  the 
difficulties  human  operators  are  likely  to  encounter  in  viewing  and 
using  abstract,  volumetric,  three-dimensional  displays  in  practical 
applications. By an abstract display, we mean one in which the image
is constructed, as opposed to reproduced veridically as in, e.g., a TV
image.
Stereoscopic  3-D  displays,  consisting  of  a  pair  of  2-D  images,  can  be 
either  abstract  or  veridical  in  this  sense.  Although  there  are  several 
areas  in  which  abstract  3-D  displays  offer  obvious  advantages,  either 
in  economy  or  clarity  of  the  displayed  data,  it  is  not  obvious  how 
various  types  of  data  should  best  be  displayed  in  order  to  minimize 
operator  errors  and  confusion.  Furthermore,  although  some  of  the 
relevant  questions  have  been  asked  before,  the  answers  provided  by 
earlier  research  have  almost  invariably  concerned  flat,  2-dimensional 
displays,  and  their  applicability  to  3-D  displays  is  not  always 
obvious. 

The  approach  taken  in  the  initial  experiments  described  here  has 
been  to  study  how  the  speed  and  accuracy  of  responses  in  a  choice 
reaction  task  are  degraded  as  the  orientation  of  the  stimulus  image 
(presented  in  a  true  volumetric  display)  is  varied  relative  to  that  of 
a  fixed  response  array. 

SpaceGraph

The  display  used  in  the  studies  is  a  true  space-filling  display 
called  SpaceGraph.  It  differs  from  stereoscopic  displays  in  that  the 
image  viewed  by  the  observer  is  truly  three-dimensional:  the  luminous 
points  from  which  the  image  is  composed  actually  exist  at  different 
depths  from  the  observer.  This  contrasts  with  stereoscopic  displays, 
which  attempt  to  recreate  with  two  flat  displays  what  the  observer's 
left and right eyes would see if they were looking at a three-
dimensional  image. 

It  is  appropriate  to  describe  briefly  here  how  the  display  works, 
because  this  will  make  it  easier  to  understand  the  experiments 
described below. [Since the display was demonstrated at the
conference, anyone interested could view the display and thus
experience at first hand the salience of the 3-D percept, which made
detailed description unnecessary.] It is also important to stress, for
readers of this report, that it is impossible to convey the immediacy
and the conviction of the 3-D image except by viewing the live image:
flat photographs of the images, such as appear in this paper as
illustrations, are highly ambiguous with respect to depth, because they
incorporate none of the cues that can be used to perceive depth.

Figure 1 shows a schematic view of the experimental apparatus,
including SpaceGraph. The observer views the face of a CRT, reflected
in a circular flexible mirror. The mirror is mounted on the front of a
low-frequency loudspeaker. When the loudspeaker is excited by a 30 Hz
sine wave, the mirror flexes, approximately spherically, cycling
successively through flat, concave, flat, and convex shapes 30 times
per second. As it does so, the virtual image of the face of the CRT,
which appears to the observer to be behind the mirror, sweeps
cyclically through a depth of about 30 cm. If a point on the face of
the CRT is momentarily brightened, at the same instant in every mirror
cycle, the observer will see a luminous point suspended at a specific
depth in the dark void behind the mirror. The depth of the point can
be varied by changing the instant in the mirror cycle at which the CRT
beam is unblanked. Thus the depth dimension is specified by the time
in each mirror cycle that a point is displayed. The lateral and
vertical positions of the point correspond to where on the face of the
CRT the bright point is produced.

Figure 1: Sketch of observer with response box and SpaceGraph
display. The observer viewed the CRT in a vibrating mirror,
which generated a virtual image of the stimulus cube behind the
mirror.

Images  are  built  of  points  and  linear  arrays  of  points.  In  the 
prototype  model  on  which  we  did  our  experiments,  about  5000  points  are 
available  for  drawing  an  image,  corresponding  to  500  cm  of  lines  at  10 
points  per  cm.  This  is  sufficient  for  a  fairly  complex  image. 
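
The relation between unblanking time and displayed depth can be
sketched numerically. The snippet below is an illustration only: it
assumes that the depth of the virtual image varies as a pure sinusoid
in phase with the 30 Hz drive and spans 30 cm, which the description
suggests but does not state exactly. The function names and the choice
of the middle of the swept volume as the depth origin are hypothetical.

```python
import math

MIRROR_HZ = 30.0        # mirror drive frequency given in the text
DEPTH_RANGE_CM = 30.0   # approximate depth swept by the virtual image


def depth_at(t_s: float) -> float:
    """Depth (cm, relative to the middle of the swept volume) at time t_s.

    Assumption: the sweep is a pure sinusoid in phase with the drive signal.
    """
    return 0.5 * DEPTH_RANGE_CM * math.sin(2.0 * math.pi * MIRROR_HZ * t_s)


def unblank_time_for(depth_cm: float) -> float:
    """Time (s) within one mirror cycle at which to unblank the beam.

    Each depth is crossed twice per cycle; this returns the crossing on the
    rising part of the sweep, wrapped into [0, 1/MIRROR_HZ).
    """
    half = 0.5 * DEPTH_RANGE_CM
    if abs(depth_cm) > half:
        raise ValueError("depth outside the swept volume")
    period = 1.0 / MIRROR_HZ
    t = math.asin(depth_cm / half) / (2.0 * math.pi * MIRROR_HZ)
    return t % period
```

Under these assumptions, a point is placed at a chosen depth by
brightening the CRT at `unblank_time_for(depth)` in every cycle; the
real mirror motion may depart from a sinusoid, in which case a measured
calibration table would replace `depth_at`.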

Properties of SpaceGraph images. SpaceGraph images have several
properties that make them unique. First, since the points comprising
the image are truly at different depths from the observer, the image
shows perspective distortion identical with that of a physical 3-D
object. The binocular parallax effect, and the movement parallax
effects that occur when the observer moves his/her head, are not
simulated; they are real. In this respect, the image has some of the
properties of a hologram. The observer can move his head laterally or
vertically, or rotate it about the line of sight, and indeed enhances
the 3-D percept by doing so. The amount of movement possible is
constrained only by the requirement that the viewer not lose sight of
the CRT face reflected in the mirror.


Second, it is not possible to hide things in the image. Since the
image consists of bright points floating in a dark void, displayed
objects are transparent. It is not obvious whether transparency is an
advantage or a disadvantage. If a large object is being maneuvered
towards a smaller target, it may be helpful to be able to see the
target through the interposed object being moved. On the other hand,
if the background can be seen through the figure, this may make it more
difficult to see the figure. These problems have not been studied
previously, because volumetric displays, in which they become an issue,
have not been available before SpaceGraph.

The third property (not unique to SpaceGraph) reflects the
abstractness of the displayed data: all the data must be represented
within the computer that drives the display. This means that very
simple transformations can be applied to the data to change the
viewpoint from which it is seen, or its scale, and these types of
transformations can also be applied independently to any specified
subset of the data. Thus, although the observer cannot walk
around the displayed object to view it from behind, he can have the
computer rotate the contents of the display, or part of it, thus
achieving the same effect.
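
A viewpoint change of this kind amounts to applying a rotation or a
scaling to the stored point coordinates. The sketch below is
illustrative only and does not describe SpaceGraph's own software: the
axis convention (X lateral, Y vertical, Z depth), the function names,
and the sample coordinates are all assumptions.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]  # (x lateral, y vertical, z depth)


def rotate_about_y(points: Iterable[Point], degrees: float) -> List[Point]:
    """Rotate points about the vertical Y-axis by the given angle."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


def scale(points: Iterable[Point], factor: float) -> List[Point]:
    """Uniformly scale points about the origin."""
    return [(factor * x, factor * y, factor * z) for x, y, z in points]


# Transform only a chosen subset of the displayed data, leaving the rest as is.
scene = {"cube_corner": [(6.0, 6.0, 6.0)], "marker": [(0.0, -6.0, 0.0)]}
scene["cube_corner"] = rotate_about_y(scene["cube_corner"], 90.0)
```

Keeping each displayed object as its own list of points is one simple
way to let a transformation act on part of the display while the rest
stays fixed.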

Experiments on Orientation


One  of  the  primary  application  areas  for  displays  such  as 
SpaceGraph  will  be  to  present  to  an  operator  information  about  the 
relative  position,  orientation,  and  movement  of  vehicles  under  his 
control, such as an integrated plan position indicator and vertical
situation display for an Air Traffic Controller. On the basis of
information gleaned from the display, the operator will make decisions
relating to the control of the vehicles. It is important to know how
quickly  such  information  can  be  obtained  from  the  display,  what  sorts 
of  confusions  are  likely  to  occur,  and  how  best  to  present  the 
information  so  as  to  minimize  control  errors.  The  area  we  have  chosen 
to  address  first  concerns  the  identification  of  direction  (up,  down, 
left,  right,  etc.)  for  an  object  presented  in  an  arbitrary  orientation. 

The control movements and stimulus images used are sketched in
Figure 1. The stimulus image consisted of an outline cube with sides
about  12  cm  long.  An  orientation  cue  was  drawn  on  its  bottom  face, 
consisting  of  a  capital  letter  V  with  its  apex  almost  touching  the 
front  edge.  One  of  the  other  five  sides,  chosen  randomly  on  each  trial 
of  an  experiment,  was  marked  with  a  "stimulus  button"  consisting  of  two 
concentric  circles.  The  observer’s  task  was  to  decide  which  face  was 
so  marked. 


Figure  2:  Schematic  illustration  of  the  appearance  of  the 
stimulus  cube  on  a  typical  trial. 


Figure  2  presents  a  schematic  illustration  of  what  the  observer  saw  on 
a  typical  trial:  the  correct  response  here  is  the  right  button.  The 
first  four  experiments  studied  the  effects  of  varying  the  orientation 
in  which  the  stimulus  cube  was  presented.  In  each  of  the  first  three 
experiments,  all  the  cube  orientations  seen  by  the  observer  were 
obtained  by  rotating  the  cube  about  one  of  its  major  axes:  about  the 
vertical Y-axis in Experiment 1, about the depth or Z-axis in
Experiment  2,  and  about  the  lateral  X-axis  in  Experiment  3.  In 
Experiment  4,  rotations  about  any  one  of  the  three  axes  occurred.  Only 
the  first  experiment  will  be  described  in  any  detail  here.  A  more 
extended description can be found in a recent report (Huggins & Getty,
1981). 

Experiment 1

In  the  first  experiment,  the  cube  could  appear  in  any  one  of  24 
different  orientations,  which  together  comprise  a  complete  rotation  of 
the cube image about its vertical Y-axis in 15 degree steps, starting
at a "canonical" orientation, which was rotated 10 degrees from the
head-on  orientation,  to  avoid  a  problem  inherent  in  SpaceGraph  with 
drawing  lines  strictly  parallel  to  the  virtual  image  of  the  CRT  face. 


Figure  3:  The  24  orientations  of  the  stimulus  cube  used  in 
Experiment  1,  in  which  rotation  was  around  the  cube's  vertical 
or  Y-axis.  All  five  stimulus  "keys"  are  shown  in  each  view,  to 
save space. On each trial, the observer saw only one of the
five stimulus keys.


On  a  single  trial,  the  observer  might  see  the  cube  in  any  of  the  24 
orientations,  and  any  one  of  the  five  buttons  might  be  showing, 
yielding  a  total  of  120  distinct  trials.  The  24  orientations  are  shown 
in  Figure  3,  which  is  a  paste-up  made  from  photographs  of  the  actual 
images.  The  images  shown  in  the  figure  differ  in  two  respects  from 
those  seen  by  the  observers: 


o All five buttons appear in each illustrated image, for
  economy, whereas observers never saw more than one button on a
  single trial.

o The photographs are flat, 2-D representations of the 3-D
  images, and are quite hard to interpret in depth because they
  contain none of the usual cues to depth. When the
  corresponding 3-D images were presented on SpaceGraph, their
  depth was immediately and unambiguously apprehended.
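
The set of 120 distinct trials follows directly from crossing the 24
orientations with the five marked faces. A minimal sketch of that
enumeration is given below; the constant names and the sign convention
for rotation are assumptions, and the face labels are the response
names used in the Results.

```python
from itertools import product

CANONICAL_DEG = 10   # offset of the canonical view from head-on
STEP_DEG = 15        # rotation step about the vertical Y-axis
FACES = ("TOP", "NEAR", "FAR", "LEFT", "RIGHT")  # bottom face carries the V

# 24 orientations covering one full revolution, starting at the canonical view.
orientations = [(CANONICAL_DEG + STEP_DEG * k) % 360 for k in range(24)]

# Every orientation paired with every face that can carry the stimulus button.
trials = list(product(orientations, FACES))
```

Shuffling `trials` (for example with `random.shuffle`) would give one
possible random presentation order; the actual randomization scheme is
not specified here.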

Procedure. The same three observers^1 served in all experiments.
To begin a trial, the observer pressed two "Ready" buttons, one with
each  hand.  Two  seconds  later  the  stimulus  cube  appeared,  with  a  button 
showing  on  one  of  its  five  faces  excluding  the  bottom  (which  bore  the 
V).  The  observer  held  down  the  ready  buttons  until  he  had  decided 
which  face  of  the  stimulus  cube  was  marked  with  a  button,  and  then 
pressed  the  physical  button  on  the  corresponding  face  of  the  response 
cube  as  fast  as  possible,  while  minimizing  errors.  Each  observer 
served  for  six  sessions,  with  a  maximum  of  two  experimental  sessions  in 
a  day,  and  a  rest  of  at  least  30  minutes  between  sessions.  Each 
session  lasted  30-40  minutes.  The  data  from  the  first  of  the  six 
sessions  was  discarded,  to  reduce  familiarization  effects. 

Two  times  were  recorded  for  each  trial:  the  "reaction"  time,  from 
the  presentation  of  the  stimulus  cube  to  the  release  of  the  ready  key 
by  the  hand  that  then  made  the  response,  and  the  "movement"  time,  from 
the  release  of  the  ready  key  to  the  depression  of  the  response  key. 
After  the  subjects  had  become  familiar  with  the  procedure,  movement 
times  were  virtually  unaffected  by  the  orientation  in  which  the 
stimulus  cube  appeared,  although  they  varied  slightly  for  the  five 
different responses. Error rates, also, were very low: about 2% of
trials in the initial sessions, and falling to about 0.5% thereafter.
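
The two recorded times can be expressed in terms of three event
timestamps per trial. The sketch below only illustrates the
definitions given above; the class, its field names, and the example
timestamps are hypothetical and are not data from the experiment.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    stimulus_on_s: float    # stimulus cube appears
    ready_release_s: float  # responding hand leaves its ready key
    response_down_s: float  # response key is pressed

    @property
    def reaction_time_ms(self) -> float:
        """Stimulus onset to release of the ready key."""
        return (self.ready_release_s - self.stimulus_on_s) * 1000.0

    @property
    def movement_time_ms(self) -> float:
        """Release of the ready key to depression of the response key."""
        return (self.response_down_s - self.ready_release_s) * 1000.0


# Made-up timestamps, in seconds from the start of the trial.
trial = Trial(stimulus_on_s=2.000, ready_release_s=2.800, response_down_s=3.050)
```

Separating the two intervals in this way is what allows the decision
component (reaction time) to be analysed independently of the time
needed to reach the response key.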

Results. Mean reaction times pooled across observers are plotted
for each of the five responses as a function of stimulus cube
orientation in Figure 4. In all the plots, the abscissa values
represent rotations away from the head-on orientation. Thus, the
canonical orientation is represented by the data points immediately to
the right of the vertical dotted lines at zero rotation. Data points
for up to one quarter revolution in each direction from the head-on
view are duplicated at the left and right sides of the plot, to make
the symmetry of the minimum at zero rotation more apparent. To reduce
the noise in the plotted data, we applied boxcar smoothing^2 to each
point before plotting it.

^1 During the Symposium, the generality of 3-D display results obtained
from small numbers of subjects was questioned. A range of figures was
quoted, some quite low, for the proportion of the population who have
stereopsis (but see Newhouse & Uttal, 1982, for conflicting evidence).
These may be valid objections. On the other hand, we have never heard
of any viewer of the SpaceGraph display (of the hundreds who have seen
it, including some who were known not to have stereopsis) who did not
immediately and effortlessly perceive the image in 3-D. This should
not be surprising, since the image itself was truly volumetric.
However, since this evidence is only anecdotal and informal, it may be
appropriate to support it more formally.

^2 In boxcar smoothing, a plotted point represents the average of the
true data point for that abscissa value with the two immediately
adjacent values.
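
The smoothing described in the footnote can be sketched as a
three-point moving average. The version below wraps around at the ends
of the series, on the assumption that the 24 orientations form a
closed circle; how the end points were actually treated is not stated,
and the function name is hypothetical.

```python
from typing import List, Sequence


def boxcar3_circular(values: Sequence[float]) -> List[float]:
    """Three-point boxcar average with wrap-around at the ends."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three points")
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3.0
        for i in range(n)
    ]
```

Each output value is the mean of a data point and its two neighbours,
so isolated noisy points are damped at the cost of slightly flattening
sharp peaks and minima.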

The functions fall into three groups: that for
the TOP responses; those for the NEAR and FAR responses; and those for
the  LEFT  and  RIGHT  responses.  (1)  The  function  for  the  TOP  responses 
is  essentially  flat  at  about  740  ms,  showing  that  this  reaction  was  not 
affected  by  the  orientation  of  the  stimulus  cube.  (2)  The  functions 
for  the  NEAR  and  FAR  responses  exhibit  plateaus.  The  reaction  time  is 
lowest  (about  800  ms)  at  the  head-on  orientation,  and  rises  roughly 
linearly  with  rotation  away  from  the  head-on  orientation  until  the 
plateau  is  reached  for  rotations  more  than  about  a  quarter  revolution 
from  the  head-on  orientation.  The  plateau  is  at  about  860  ms  for  the 
NEAR  reaction  and  890  ms  for  the  FAR  reaction.  (3)  The  third  group  of 
functions,  the  LEFT  and  RIGHT  responses,  exhibit  much  more  dramatic 
effects  of  orientation.  The  LEFT  and  RIGHT  reactions,  like  the  NEAR 
and  FAR,  show  minima  of  about  830  and  890  ms,  respectively,  near  to  the 
head-on  orientation.  (The  elevated  minimum  for  the  LEFT  response  is 
probably  due  to  the  different  response  expectancies  for  this  response: 
for  all  three  observers  it  was  the  only  response  made  with  the  left 
hand.)  For  both  the  LEFT  and  RIGHT  responses,  reaction  time  increased 
rapidly  and  roughly  linearly  with  rotation  away  from  the  head-on 
position,  reaching  ragged  peaks  of  about  1150  ms  at  one-half  revolution 
from  the  head-on  orientation. 
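
As a rough worked example of the LEFT and RIGHT functions, the sketch
below treats reaction time as rising linearly from its minimum at the
head-on orientation to the peak at one-half revolution. The strictly
linear form and the use of the rounded values quoted above are
simplifying assumptions, not a fit to the data; the function and
parameter names are hypothetical.

```python
def linear_rt_ms(rotation_deg: float, minimum_ms: float,
                 peak_ms: float = 1150.0) -> float:
    """Approximate reaction time for a rotation away from head-on."""
    angle = abs(rotation_deg) % 360.0
    if angle > 180.0:
        angle = 360.0 - angle  # rotations are treated as symmetric about head-on
    slope = (peak_ms - minimum_ms) / 180.0  # ms per degree of rotation
    return minimum_ms + slope * angle
```

With the two quoted minima of about 830 and 890 ms, the implied slopes
are roughly 1.8 and 1.4 ms per degree of rotation.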

Discussion.  We  suggest  the  following  explanations  for  the  shapes 
of  the  three  types  of  function.  Observers  use  several  different 
strategies  for  determining  which  face  of  the  cube  is  marked.  The 
strategies  differ  both  in  their  efficiency,  and  in  the  conditions  under 
which  they  can  be  applied.  In  conditions  under  which  more  than  one 
strategy  can  be  applied,  observers  apparently  follow  both  strategies  in 
parallel,  and  accept  whichever  result  becomes  available  first. 

Three  main  strategies  were  used,  as  follows: 

The  Spatial  Strategy:  Consider  first  the  TOP  stimulus  key.  With 
rotation  about  the  vertical  axis,  neither  the  position  within  the 
retinal  image  nor  the  retinal  shape  of  the  TOP  key  changed  as  the 
orientation  of  the  stimulus  cube  was  altered.  Furthermore,  the  spatial 
loci  of  all  the  other  stimulus  keys  in  the  retinal  image,  in  all 
orientations  used  in  the  experiment,  were  well  separated  from  that  of 
the  TOP  key.  Therefore,  observers  were  able  to  use  a  highly  efficient 
and  compatible  spatial  mapping  strategy  for  selecting  the  TOP  response, 
independent  of  cube  orientation,  and  this  resulted  in  the  fast  flat 
reaction  time  function. 

The  Rotational  Strategy:  As  can  be  seen  from  Figure  3,  the  24 
stimulus  cube  orientations  fall  into  four  groups  (corresponding  to  the 
four  rows  of  the  figure),  with  the  same  six  images  appearing  in  each 
group.  The  only  difference  between  the  four  images  in  a  single  column 
of  the  figure  lies  in  the  different  orientation  of  the  V  on  the  cube's 
bottom face. Thus, for the four keys on the vertical sides of the
cube,  the  only  information  that  specifies  which  face  of  the  cube  bears 
the  stimulus  key  is  the  position  of  the  V  relative  to  the  stimulus  key. 
With respect to the V, the four stimulus keys fall into two classes,
with the LEFT and RIGHT keys in one class and the NEAR and FAR keys in
the other.

The  V  exhibits  lateral  symmetry,  so  although  a  stimulus  key 
appearing  to  the  side  of  the  V  can  be  quickly  identified  as  either  the 
LEFT  or  the  RIGHT  key,  picking  the  correct  one  is  not  easy.  The 
orientation  of  the  key  relative  to  the  V  must  be  determined  in  detail 
before  the  choice  between  them  can  be  made.  Subjective  reports  suggest 
that  the  observers  imagined  themselves  looking  from  the  apex  of  the  V 
towards  its  points,  and  then  deciding  whether  the  displayed  key  was  on 
the  left  or  the  right.  This  implies  a  sort  of  mental  rotation, 
although  it  is  the  observer's  body  image  that  is  rotated  rather  than 
the  stimulus  image.  The  possibility  that  observers  perform  some  sort 
of  mental  rotation  gains  some  credibility  from  the  strong  similarity 
between  the  shape  of  the  reaction  time  functions  for  the  LEFT  and  RIGHT 
responses  and  those  obtained  by  Cooper  &  Shepard  (1973)  and  by 
Hintzman,  O'Dell,  &  Arndt  (1981)  in  earlier  studies  of  mental 
rotation.  The  slight  minimum  at  one-half  revolution  from  the  head-on 
position  for  the  RIGHT  response  was  much  more  pronounced  in  the  data  of 
one  observer,  who  reported  that  in  these  orientations  he  recognized  the 
cube  as  being  reversed,  and  chose  his  response  on  the  basis  of  the 
spatial  strategy,  and  then  reversed  it. 

With  regard  to  the  NEAR  and  FAR  stimulus  keys,  the  same  rotational 
strategy  could  be  applied.  That  is,  the  observer  could  mentally  rotate 
his/her  body  image  into  alignment  with  the  V,  and  then  make  a  NEAR  or 
FAR  response  according  to  the  near  or  far  location  of  the  stimulus  key. 
This  rotational  strategy  would  account  for  the  sloping  skirts  of  the 
NEAR  and  FAR  response  functions  for  orientations  near  the  head-on 
position.  However,  the  flat  plateaus  observed  in  the  NEAR  and  FAR 
functions  over  much  of  the  range  of  orientations  probably  reflect  the 
use  of  a  second,  non-rotational  strategy. 

The  Relational  Strategy:  The  V  is  asymmetric  about  its  horizontal 
bisector,  so  the  NEAR  and  FAR  keys  are  uniquely  coded  by  their  relation 
to  the  V.  The  apex  of  the  V  always  points  towards  the  NEAR  key,  and  the 
open  end  of  the  V  always  points  towards  the  FAR  key.  The  coding  of  the 
NEAR  key  is  slightly  more  direct  than  that  of  the  FAR  key,  because  the 
apex  of  the  V  almost  touches  the  NEAR  face  of  the  cube,  whereas  the 
points  of  the  V  do  not  reach  far  enough  to  touch  the  FAR  face. 

Secondly,  the  V  is  easily  interpreted  symbolically  as  an  arrow  pointing 
towards  the  NEAR  key,  which  has  a  high  compatibility.  This  means  that 
the  NEAR  and  FAR  responses  can  always  be  selected  by  a  relational 
strategy:  if  the  V  is  pointing  towards  the  stimulus  key,  press  the  NEAR 
button;  and  if  it  is  pointing  away,  press  the  FAR  button.  Use  of  this 
relational  strategy  would  yield  reaction  times  that  are  relatively 
independent  of  cube  orientation  and,  except  for  orientations  near  the 
head-on  position,  apparently  faster  than  those  obtained  with  the 
rotational  strategy.  It  is  not  clear  whether  the  observer  first 
processes  cube  orientation  in  order  to  choose  between  the  rotational 
and  the  relational  strategies,  or  pursues  both  strategies  in  parallel 
(Woods,  1974).  In  the  latter  case,  the  response  would  be  determined  by 
whichever  strategy  first  produced  a  decision,  and  the  observed 
composite  reaction  time  functions  would  reflect  the  fact  that  each 
strategy  wins  over  only  part  of  the  cube's  rotational  cycle. 
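
The parallel ("race") account described above can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the linear cost of the rotational strategy, the constant cost of the relational strategy, and every millisecond value are hypothetical and are not fitted to the data reported here. A fuller race model would also treat finishing times as random variables; this version is deterministic.

```python
# Deterministic sketch of a two-strategy race. All parameter values are
# hypothetical placeholders, not estimates from the experiments.

def rotational_rt(angle_deg, base_ms=700.0, ms_per_deg=5.0):
    """Assumed rotational strategy: time grows with distance from head-on."""
    wrapped = angle_deg % 360
    distance = min(wrapped, 360 - wrapped)  # shortest way back to head-on
    return base_ms + ms_per_deg * distance

def relational_rt(angle_deg, plateau_ms=1000.0):
    """Assumed relational strategy: time does not depend on orientation."""
    return plateau_ms

def race_rt(angle_deg):
    """The response is produced by whichever strategy finishes first."""
    return min(rotational_rt(angle_deg), relational_rt(angle_deg))

# 24 orientations in equal steps around one full revolution.
angles = list(range(0, 360, 15))
composite = [race_rt(a) for a in angles]
```

With these placeholder values the composite function has sloping skirts near the head-on orientation, where the rotational strategy wins, and a flat plateau elsewhere, where the relational strategy wins. This is the qualitative shape attributed to the NEAR and FAR functions.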

In  either  case,  the  utility  of  relational  strategies  obviously 
depends  on  the  cue  used  to  mark  the  orientation  of  the  cube.  In  the 
present  experiment,  the  symbol  chosen  displayed  left-right  symmetry  but 
not  top-bottom  symmetry.  Furthermore,  it  was  placed  on  the  bottom  face 
of  the  cube,  symmetrically  between  the  left  and  right  faces,  but 
asymmetrically  between  the  front  and  back  faces  (that  is,  the  V  was 
placed  between  the  center  of  the  bottom  face  and  its  NEAR  edge).  Thus 
two  distinct  relational  strategies  could  be  used  for  the  NEAR  and  FAR 
responses,  one  based  on  the  position  of  the  V  and  the  other  on  its 
shape:  the  V  appeared  on  the  NEAR  half  of  the  bottom  face,  and  its  apex 
also  pointed  towards  the  NEAR  face. 

A  relational  strategy  could  have  been  used  to  select  the  TOP 
response,  since  the  TOP  key  always  appeared  on  the  face  opposite  the 
V.  This  strategy  would  have  depended  only  on  the  position  of  the  V,  and 
not  on  its  shape.  However,  this  strategy  was  probably  not  used  because 
the  spatial  strategy  consistently  gave  faster  decisions.  No  relational 
strategy  was  possible  for  the  LEFT  or  RIGHT  responses,  because  the  LEFT 
and  RIGHT  stimulus  keys  were  placed  symmetrically  with  respect  to  both 
the  position  and  the  shape  of  the  V.  Had  a  different  letter  been 
chosen,  such  as  an  E  or  an  F,  or  had  the  cue  been  placed  left-right 
asymmetrically,  then  a  relational  strategy  could  have  been  used  for 
these  responses  also.  On  the  other  hand,  the  choice  of  an  E,  with  its 
up-down  symmetry,  would  have  prevented  a  relational  strategy  for  the 
NEAR  and  FAR  responses  based  on  the  shape  of  the  letter,  although  one 
based  on  its  position  would  still  have  been  available. 
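
The symmetry argument above can be restated as a simple rule: a relational strategy is available for a pair of opposite keys whenever the cue's shape or its position is asymmetric with respect to that pair. The sketch below encodes that rule. The class name, field names, and the assumption that an E would be drawn upright and placed where the V was are illustrative choices, not details given in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cue:
    """Symmetries of an orientation cue on the bottom face (hypothetical model)."""
    name: str
    shape_lr_symmetric: bool  # shape is mirror-symmetric left/right
    shape_nf_symmetric: bool  # shape is mirror-symmetric near/far ("up-down")
    pos_lr_symmetric: bool    # placed midway between LEFT and RIGHT faces
    pos_nf_symmetric: bool    # placed midway between NEAR and FAR faces

def relational_bases(cue):
    """List the cue properties that could support a relational strategy."""
    def bases(shape_symmetric, pos_symmetric):
        found = []
        if not shape_symmetric:
            found.append("shape")
        if not pos_symmetric:
            found.append("position")
        return found
    return {
        "LEFT/RIGHT": bases(cue.shape_lr_symmetric, cue.pos_lr_symmetric),
        "NEAR/FAR": bases(cue.shape_nf_symmetric, cue.pos_nf_symmetric),
    }

# The V of Experiment 1, and an E assumed to occupy the same position.
V = Cue("V", True, False, True, False)
E = Cue("E", False, True, True, False)
```

Under these assumptions `relational_bases(V)` offers nothing for LEFT/RIGHT and both shape and position for NEAR/FAR, while `relational_bases(E)` offers shape for LEFT/RIGHT and position only for NEAR/FAR, matching the cases discussed in the paragraph above.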


The  second  and  third  experiments  were  virtually  identical  with  the 
first,  except  that  the  stimulus  cube  was  rotated  about  the  Z,  or  depth 
axis,  in  Experiment  2,  and  about  the  X,  or  lateral  axis,  in  Experiment 
3. 

The  hypotheses  proposed  above  to  explain  the  results  obtained  with 
rotation  about  the  Y-axis  make  the  following  predictions  when  the  cube 
is  rotated  about  its  Z-axis  instead.  Since  the  axis  of  rotation  now 
passes  through  the  NEAR  and  FAR  stimulus  keys,  instead  of  the  TOP  key, 
these  two  responses  should  show  the  fast,  flat  response  time  functions 
associated  with  use  of  the  spatial  strategy,  whereas  the  TOP  function 
should  now  show  a  plateau,  appropriate  to  the  use  of  the  relational 
strategy.  The  LEFT  and  RIGHT  functions  should  be  similar  to  those 
obtained  in  Experiment  1,  since  the  same  constraints  apply  as  before. 
That  is,  the  spatial  strategy  cannot  be  applied  to  the  LEFT  and  RIGHT 
keys,  since  the  locations  of  these  keys  vary  with  cube  orientation,  and 
the  relational  strategy  is  not  effective  either,  because  the  lateral 
symmetry  of  the  V  complicates  the  choice  between  them.  The  rotational 
strategy  is  the  only  one  remaining. 
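
The geometric premise behind these predictions, that only the keys lying on the axis of rotation keep a fixed position, can be checked with a short sketch. The coordinate conventions (X lateral, Y vertical, Z depth, with NEAR at positive Z), the use of face-centre directions to stand for key positions, and the 15-degree step (inferred from 24 orientations per revolution) are assumptions made for illustration. The sketch says nothing about whether observers can actually exploit the invariance.

```python
import math

# Unit vectors from the cube centre to the centre of each keyed face.
# The sign conventions are assumed for illustration.
FACE_NORMALS = {
    "TOP": (0.0, 1.0, 0.0),
    "LEFT": (-1.0, 0.0, 0.0),
    "RIGHT": (1.0, 0.0, 0.0),
    "NEAR": (0.0, 0.0, 1.0),
    "FAR": (0.0, 0.0, -1.0),
}

def rotate(v, axis, deg):
    """Rotate vector v by deg degrees about the named coordinate axis."""
    x, y, z = v
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    if axis == "X":
        return (x, c * y - s * z, s * y + c * z)
    if axis == "Y":
        return (c * x + s * z, y, -s * x + c * z)
    if axis == "Z":
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError(f"unknown axis: {axis}")

def fixed_keys(axis, step_deg=15):
    """Keys whose face centre stays put at every orientation about `axis`."""
    return [
        key
        for key, normal in FACE_NORMALS.items()
        if all(
            math.dist(rotate(normal, axis, angle), normal) < 1e-9
            for angle in range(0, 360, step_deg)
        )
    ]
```

Calling `fixed_keys` for the Y, Z, and X axes picks out the TOP key, the NEAR and FAR keys, and the LEFT and RIGHT keys respectively, which are the keys for which the spatial strategy is predicted in Experiments 1, 2, and 3.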

A  second  purpose  of  Experiments  2  and  3  was  to  compare  the 
difficulty  of  rotation  about  each  of  the  three  axes:  people  have  much 
more  experience  with  rotations  about  the  vertical,  Y-axis  in  everyday 
life,  and  therefore  one  might  expect  rotation  about  this  axis  to  be 
easiest.  The  procedure  exactly  repeated  that  in  Experiment  1,  except 
for  the  changed  stimulus  images.  These  are  shown  in  Figure  5. 


Figure  5:  The  24  orientations  of  the  stimulus  cube  used  in  the 
Z-axis  task.  On  each  trial,  the  observer  saw  only  one  of  the 
five  stimulus  keys. 


Results.  Pooled  reaction  times  are  plotted  for  each  response  in 
Figure  6.  The  function  for  the  TOP  response  is  roughly  plateau-shaped, 
as  predicted,  but  there  is  an  additional  minimum  at  one  half  revolution 
from  the  head-on  orientation,  giving  the  function  an  M-shape.  The 
functions  for  the  LEFT  and  RIGHT  responses  are  sharply  peaked,  and  are 
very  similar  in  shape  to  those  of  Experiment  1,  again  as  predicted. 

The  functions  for  the  NEAR  and  FAR  responses  are  plateau-shaped,  and 
are  very  similar  both  in  overall  shape  and  level  to  the  corresponding 
functions  in  Experiment  1  (Figure  4).  The  prediction  for  these 
functions  was  not  fulfilled:  a  fast,  flat  function  appropriate  to  the 
spatial  strategy  was  expected. 


Figure  6:  Mean  decision  times  in  the  Z-axis  task  for  each  of 
the  five  response  buttons,  as  a  function  of  the  orientation  of 
the  stimulus  cube. 

Discussion.  When  the  stimulus  cube  is  rotated  about  its  depth 
axis,  the  positions  of  the  NEAR  and  FAR  stimulus  keys  remain  invariant 
within  the  image.  However,  observers  were  apparently  not  able  to  use  a 
spatial  coding  strategy  for  the  NEAR  and  FAR  keys,  because,  if  they 
had,  the  reaction  time  functions  would  have  been  flat  and  fast  like 
that  for  the  TOP  response  in  Experiment  1. 

Why  could  a  spatial  strategy  not  be  used  for  the  NEAR  and  FAR 
responses  in  the  present  experiment?  One  possibility  is  that  the 
spatial  strategy  of  Experiment  1  did  not  require  the  image  to  be 
interpreted  as  a  3-D  object,  but  could  be  applied  directly  to  the  raw 
2-D  retinal  image.  Thus,  however  immediate,  automatic,  and  salient  the 
3-D  percept  was,  the  TOP  response  in  Experiment  1  could  be  made  on  the 
basis  of  a  patch  of  light  at  a  particular  place  in  the  retinal  image. 
Two  aspects  of  the  image  may  have  discouraged  or  prevented  use  of  the 
spatial  strategy  in  Experiment  2.  First,  the  NEAR  and  FAR  stimulus 
keys  appear  very  close  to  each  other  within  the  2-D  retinal  image  (see 
Figure  5),  so  application  of  the  spatial  strategy  to  the  retinal  image 
requires  a  finer  discrimination  than  was  necessary  to  identify  the  TOP 
key  in  Experiment  1.  Secondly,  the  main  spatial  separation  between  the 
NEAR  and  FAR  stimulus  keys  was  in  the  depth  dimension,  and  apprehending 
this  separation  therefore  required  that  the  image  be  perceived  in  3-D 
before  the  difference  in  depth  between  the  NEAR  and  FAR  keys  became 
available  to  support  response  selection. 

The  similarity  of  the  NEAR  and  FAR  functions  in  Figure  6  to  those 
in  Figure  4  suggests  that  observers  used  the  same  strategy  for  these 
keys  in  the  two  experiments,  and  we  argued  above  that  this  must  be  a 
relational  strategy  that  relied  on  the  V  pointing  towards  the  NEAR  key 
and  away  from  the  FAR  key.  As  before,  we  attribute  the  sloping  skirts 
of  both  functions  to  the  use  of  a  rotational  strategy  near  the  head-on 
orientation.  (Although  the  NEAR  and  FAR  keys  remain  fixed  in  position, 
the  position  of  the  bottom  face,  bearing  the  V,  varies  as  the  cube  is 
rotated  about  its  Z-axis.)  Thus,  for  small  rotations  of  the  cube,  it 
appears  to  be  faster  to  rotate  one’s  body  image  mentally  until  the  face 
with  the  V  appears  on  the  bottom  of  the  cube,  and  then  respond 
spatially,  while  for  larger  rotations  it  is  faster  to  determine  the 
relationship  of  the  stimulus  key  to  the  V  without  mental  rotation. 

The  plateau  shape  of  the  TOP  function  suggests  that  the  TOP  key 
also  was  identified  by  means  of  a  relational  strategy,  except  in 
orientations  close  to  the  upright,  when  a  combination  of  a  small 
rotation  and  the  spatial  strategy  was  used.  The  dip  in  the  TOP 
function  at  one  half  revolution  suggests  that  the  spatial  strategy  may 
have  been  applied  here  also,  as  well  as  in  the  upright  orientation. 
Provided  that  either  the  key  or  the  V  was  on  the  uppermost  face,  and 
the  other  was  on  the  lower  face,  the  TOP  response  could  be  correctly 
chosen  without  deciding  which  was  which. 

The  shape  of  the  LEFT  and  RIGHT  functions  suggests  that  the  same 
difficulties  were  encountered  in  choosing  these  responses  as  in 
Experiment  1.  Because  of  the  lateral  symmetry  of  the  V,  a  relational 
strategy  was  not  effective,  and  the  rotational  strategy  was  adopted  by 
default.  As  before,  reaction  times  increased  dramatically  with 
increasing  rotation  away  from  the  head-on  orientation. 


In  Experiment  3,  the  cube  was  rotated  about  its  lateral  X-axis. 

The  spatial  strategy  now  applies  to  the  LEFT  and  RIGHT  responses,  since 
the  position  of  these  keys  remains  fixed  in  the  image.  On  the  other 
hand,  the  TOP,  NEAR,  and  FAR  keys  require  the  use  of  a  relational 
strategy,  since  each  appears  in  an  unambiguous  spatial  relationship  to 
the  V.  Accordingly,  the  reaction  time  functions  should  be  plateaus  with 
sloping  skirts  near  to  the  head-on  orientation,  as  found  earlier  with 
the  relational  strategy.  The  procedure  exactly  repeated  the  earlier 
experiments,  except  for  the  stimulus  images,  which  are  shown  in  Figure 
7. 


Figure  7:  The  24  orientations  of  the  stimulus  cube  used  in  the 
X-axis  task.  On  each  trial,  the  observer  saw  only  one  of  the 
five  stimulus  keys. 


Results.  The  pooled  reaction  time  functions  for  each  of  the 
responses  are  plotted  in  Figure  8.  The  functions  for  the  LEFT  and 
RIGHT  responses  are  fast  and  almost  flat,  showing  that  rotation  of  the 
cube  image  about  its  lateral  axis  did  not  have  any  effect  on  the  time 
taken  to  identify  these  faces.  The  NEAR  and  FAR  response  functions 
have  pronounced  peaks  at  about  one  half  revolution  from  the  head-on 
orientation,  where  the  reaction  times  are  about  1100  ms  (NEAR)  and  1200 
ms  (FAR).  In  each  function,  there  are  subsidiary  minima  at  about  one 
quarter  and  three  quarters  of  a  revolution  from  the  canonical 
orientation.  The  function  for  the  TOP  response  is  similar  to  those  for 
the  NEAR  and  FAR  responses,  except  that,  instead  of  a  peak  at  one  half 
revolution  from  the  canonical  orientation,  there  is  a  minimum.  This 
gives  the  function  an  oscillatory  character,  with  pronounced  minima  at 
the  four  quarter-revolution  orientations,  and  pronounced  maxima  at 
intervening  orientations. 


Figure  8:  Mean  decision  times  in  the  X-axis  task  for  each  of 
the  five  response  buttons,  as  a  function  of  the  orientation  of 
the  stimulus  cube. 


Discussion.  When  the  cube  was  rotated  about  its  lateral  axis,  all 
the  conditions  were  met  for  the  retinal  spatial  strategy  to  apply 
successfully  to  the  LEFT  and  RIGHT  responses.  The  flatness  of  the 
reaction  time  functions  for  these  two  responses,  and  the  similarity  of 
these  times  (720  ms  and  750  ms  respectively)  to  the  times  for  the  TOP 
response  in  Experiment  1  (730  ms),  support  the  conclusion  that  the 
spatial  strategy  was  in  fact  used. 

The  TOP  reaction  time  function  passes  through  four  distinct  maxima 
and  minima  as  the  cube  image  completes  a  single  revolution  about  its 
lateral  axis.  The  minima  correspond  to  images  in  which  the  cube  is 
seen  with  its  edges  aligned  vertically  and  horizontally.  A  sinusoidal 
shape  similar  to  the  shape  of  the  TOP  function  is  obtained  by  plotting 
the  distance  between  the  TOP  key  and  its  nearest  neighbor  in  the 
retinal  image.  Several  details  in  the  shapes  of  the  functions  may  be 
explainable  in  terms  of  such  properties  of  the  images  involved,  but  the 
evidence  is  insufficient  to  warrant  firm  conclusions. 


The  functions  for  the  NEAR  and  FAR  responses  are,  with  minor 
exceptions,  very  similar  to  those  for  the  LEFT  and  RIGHT  responses  in 
Experiment  1  (Figure  4),  which  were  associated  with  the  rotational 
strategy.  This  conflicts  with  our  expectations  for  the  NEAR  and  FAR 
responses  in  the  present  experiment.  We  expected  the  relational 
strategy  to  be  applied,  since  the  NEAR  stimulus  key  is  always  "pointed 
to"  by  the  V,  and  the  FAR  key  is  "pointed  away  from."  But  the 
functions  obtained  are  not  the  plateau-shaped  functions  we  associated 
with  use  of  the  relational  strategy.  Rather,  they  repeat  the  peak-shaped 
functions  found  in  Experiment  1  for  the  LEFT  and  RIGHT 
responses,  which  we  ascribed  to  use  of  a  rotational  strategy. 

Comparison  of  Figures  3  and  7  suggests  that  the  "pointing"  aspect  of 
the  V  was  much  easier  to  apprehend  when  the  cube  was  rotated  about  its 
vertical  axis  (Figure  3)  than  when  it  was  rotated  about  its  lateral 
axis  (Figure  7).  In  the  former  case,  the  V  always  lay  in  a  true 
horizontal  plane  near  the  bottom  of  the  image,  whereas  its  position  was 
much  less  predictable  in  the  latter.  This  may  have  made  it  much  harder 
to  apply  a  relational  strategy  in  the  present  experiment  than  in 
Experiment  1,  leading  to  adoption  of  a  rotational  strategy  instead. 

Experiment  4 


In  typical  real-life  applications  of  a  display  such  as  SpaceGraph, 
objects  will  likely  appear  in  arbitrary  orientations.  Users  of  the 
display  will  therefore  not  be  able,  in  general,  to  use  strategies  that 
capitalize  on  properties  of  a  particular  set  of  orientations,  such  as 
the  fact  that  all  orientations  represent  rotation  about  a  single  axis. 
In  two  of  the  foregoing  experiments,  a  direct  spatial  encoding  strategy 
could  be  used  to  select  one  or  two  of  the  available  responses.  In  the 
wider  context,  the  spatial  strategy  will  be  less  effective.  Its  use 
will  be  appropriate  in  only  some  orientations,  and  its  applicability 
must  be  determined  before  it  can  be  applied. 

Procedure.  Experiment  4  was  similar  to  the  earlier  experiments, 
except  that  the  stimulus  ensemble  included  rotations  about  each  of  the 
three  axes.  Eight  stimulus  cube  orientations  were  selected  from  each 
of  the  three  earlier  experiments,  representing  between  them  a  complete 
rotation  of  the  cube  image  about  each  of  the  three  major  axes  of  the 
cube  by  equal  increments  of  45  degrees.  The  24  images  correspond  to 
the  first  and  fourth  columns  of  images  that  appeared  in  Figures  3,  5, 
and  7.  Of  these  24  images,  only  the  22  different  ones  were  used  in  the 
experiment:  the  repetitions  of  the  canonical  orientation  were  omitted. 
The  procedure  was  again  identical  to  that  of  the  first  three 
experiments,  except  that  there  were  only  110  different  stimuli  (5 
stimulus  keys  x  22  orientations)  in  each  block  instead  of  120. 
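
The stimulus counts in this procedure follow from simple enumeration, sketched below. The tuple labels and function name are hypothetical; the only facts taken from the text are the three axes, the 45-degree increments, the five stimulus keys, and the omission of repeated canonical orientations.

```python
AXES = ("Y", "Z", "X")
KEYS = ("TOP", "LEFT", "RIGHT", "NEAR", "FAR")

def mixed_orientations(step_deg=45):
    """Distinct cube orientations: eight per axis, canonical counted once."""
    seen = set()
    ordered = []
    for axis in AXES:
        for angle in range(0, 360, step_deg):
            # A rotation of 0 degrees gives the same canonical image
            # whichever axis is named, so it gets a single shared label.
            label = ("canonical", 0) if angle == 0 else (axis, angle)
            if label not in seen:
                seen.add(label)
                ordered.append(label)
    return ordered

orientations = mixed_orientations()
stimuli = [(key, orientation) for orientation in orientations for key in KEYS]
```

Three axes at eight orientations each give 24 images, of which 22 are distinct once the canonical orientation is counted a single time, and five keys at each of the 22 orientations give the 110 stimuli per block.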

Results.  The  reaction  time  functions  for  the  15  combinations  of 
the  5  responses  and  the  three  rotation  axes  are  shown  in  Figure  9.  The 
rows  of  panels  correspond  to  the  5  responses,  and  the  columns  show  the 
effects  of  rotations  about  the  Y,  Z,  and  X  axes.  Two  functions  appear 
in  each  panel.  The  dashed  line  represents  the  reaction  times  obtained 
in  the  present  experiment,  with  "mixed"  trials  (rotation  about  any  one 
of  the  three  axes),  and  the  solid  line  represents  the  reaction  times 
obtained  for  the  identical  stimuli  in  Experiment  1,  2,  or  3,  with 
"pure"  trials  (rotation  about  only  one  of  the  three  axes,  a  different 
one  in  each  experiment).  Each  panel  is  labeled  with  a  letter  to 
simplify  reference.  Boxcar  averaging  was  not  applied  to  the  data 
points  in  Figure  9. 

Three  different  relationships  occur  between  the  results  obtained 
with  pure  rotations  (solid  functions)  and  with  mixed  rotations  (dashed 
functions).  In  four  panels  (B,  K,  L,  M)  there  are  only  minor 
differences  between  the  two  functions;  and  in  six  more  (C,  D,  E,  F,  I, 
J)  there  are  only  minor  differences  except  at  the  half-revolution 
orientations,  where  the  mixed  functions  show  a  marked  peak.  In  the 
remaining  five  panels  (A,  G,  H,  N,  O),  the  reaction  times  are  longer  in 
the  mixed  condition  (dashed  function)  than  in  the  pure  condition  (solid 
function)  at  all  orientations,  the  differences  ranging  from  50-100  ms 
at  the  canonical  orientation  to  over  a  second  at  the  reversed 
orientations.  Of  these  latter  five  panels,  three  (A,  N,  and  O) 
correspond  to  conditions  in  which  the  spatial  strategy  was  applied  in 
Experiments  1  and  3.  The  other  two  (G  and  H)  correspond  to  conditions 
in  which  the  spatial  strategy  was  expected  to  apply  in  Experiment  2. 
Thus,  the  reaction  times  obtained  in  the  present  experiment  were  quite 
similar  to  those  obtained  in  the  earlier  experiments,  except  that 
(a)  the  peaks  at  the  reversed  orientations  were  much  more  pronounced, 
and  (b)  no  fast,  flat  reaction  time  functions  were  obtained  like  those 
associated  with  the  spatial  strategy. 

Discussion.  Consider  first  the  pairs  of  functions  for  the  LEFT 
and  RIGHT  responses  shown  in  the  six  lower  panels  of  Figure  9.  There 
is  a  striking  similarity  among  the  skirts  of  10  of  the  12  functions, 
the  exceptions  being  the  solid  functions  in  panels  N  and  O,  where  the 
spatial  strategy  was  applied.  There  are  two  aspects  to  the  similarity. 
First,  setting  aside  for  the  moment  the  data  points  at  the 
half-revolution  orientations,  in  panels  D,  E,  I,  and  J  each  solid 
function  is  similar  to  the  dashed  function  in  the  same  panel.  This 
suggests  that  observers  used  the  same  strategy  in  the  pure-  and 
mixed-axis  conditions  for  the  LEFT  and  RIGHT  responses.  Second,  (a)  the 
solid  functions  in  these  same  panels  are  very  similar  to  each  other,  and 
(b)  the  mixed  functions  in  the  same  panels  (and  also  those  in  panels  N 
and  O)  are  also  very  similar  to  each  other,  again  excluding  the  data 
points  at  the  peaks.  This  suggests  that  the  amount  of  rotation  was 
more  important  than  which  axis  it  was  about.  These  observations  lead 
to  a  striking  conclusion:  that  the  basic  strategy  for  determining  the 
LEFT  and  RIGHT  responses  was  similar  in  all  the  experiments,  except  in 
Experiment  3,  where  the  spatial  strategy  could  be  used  because  the 
rotational  axis  passed  through  the  LEFT  and  RIGHT  keys.  Because  of  the 
peaked  shape  of  all  of  these  functions,  we  believe  the  common  strategy 
was  rotational.  The  effect  on  reaction  times  of  rotating  the  cube  away 
from  the  canonical  orientation  was  very  similar  for  all  three  of  the 
rotation  axes,  both  for  the  pure  and  the  mixed  conditions.  This  result 
was  quite  unexpected. 


Figure  9:  Mean  decision  times  in  the  XYZ-task  (dashed  lines 
joining  open  data  points)  are  compared  with  mean  decision  times 
obtained  on  the  same  stimuli  in  Experiment  1,  2,  or  3  (solid 
lines  joining  filled  data  points)  as  a  function  of  orientation. 
Functions  are  plotted  separately  for  each  of  the  five  responses 
(rows),  and  for  rotations  about  the  Y,  Z,  and  X  axes  (columns). 

The  only  orientations  where  major  differences  occurred  between  the 
present  experiment  and  the  earlier  pure-axis  rotations  were  those  one 
half  revolution  away  from  the  canonical  orientation.  Here,  the 
reaction  times  for  the  LEFT  and  RIGHT  responses  were  very  much  longer 
than  before.  There  is  an  obvious  explanation  for  the  peaks  in  the  Z-axis 
and  the  X-axis  functions.  The  two  images  involved  were  the  only 
two  in  which  the  cube  image  was  inverted,  with  the  V  on  its  uppermost 
face.  When  the  cube  was  inverted  by  rotation  around  the  Z-axis,  the 
apex  of  the  V  pointed  towards  the  observer,  and  the  stimulus  key  on  the 
left  of  the  image  required  a  RIGHT  response.  When  the  cube  was 
inverted  by  rotation  about  the  X-axis,  on  the  other  hand,  the  V  pointed 
away  from  the  observer  and  the  stimulus  key  on  the  left  of  the  image 
required  a  LEFT  response.  Furthermore,  this  relationship  between  the 
direction  of  the  V  and  the  reversal  of  the  spatial  coding  was  the 
opposite  of  that  applying  when  the  cube  was  rotated  about  the  Y-axis. 
That  is,  when  the  cube  was  inverted,  the  spatial  strategy  could  be 
applied  directly  if  the  V  pointed  away  from  the  observer.  But  when  the 
cube  was  upright,  with  the  V  on  the  under  face,  the  spatial  strategy 
had  to  be  reversed  if  the  V  pointed  away  from  the  observer.  The  fact 
that  peaks  occurred  also  in  the  mixed  functions  in  panels  D  and  E 
suggests  that  observers  were  influenced  by  this  inconsistency.  Similar 
factors  may  account  for  the  much  smaller  peaks  that  occurred  in  the 
mixed  functions  in  panels  C  and  F.  The  observer  could  resolve  this 
quandary  by  rotating  the  cube  mentally  about  the  Z-  or  the  X-axis, 
whichever  was  appropriate.  The  observers  reported  great  uncertainty  in 
choosing  which  axis  to  rotate  about.  This  indecision,  possibly 
accompanied  by  unsuccessful  initial  rotations  about  the  wrong  axis,  may 
well  account  for  the  considerably  lengthened  reaction  times  to  the 
half-rotated  cubes. 

Experiments  on  Cue  Ambiguity 

The  foregoing  experiments  involved  measuring  how  long  an  observer 
took  to  identify  the  marked  face  of  the  stimulus  cube,  using  an 
orientation  cue  whose  shape  and  location  exhibited  symmetries  with 
respect  to  one  pair  of  responses,  but  not  to  another.  The  hypotheses 
proposed  to  explain  the  experimental  findings  involve  strategies  that 
depend  on  these  symmetries.  The  hypotheses  can  be  simply  tested  by 
altering  the  relations  between  the  symmetries  and  the  pairs  of 
responses. 

Thus,  the  aim  of  Experiments  5  and  6  was  to  find  out  whether 
appropriate  coding  of  the  stimulus  objects  within  the  display  can 
significantly  reduce  the  substantial  display-control  incompatibility 
found  in  the  earlier  studies. 


In  Experiment  5,  the  orientation  cue  was  again  a  left-right 
symmetric  capital  letter  drawn  on  the  bottom  face  of  the  cube.  The 
letter  was  an  A,  rather  than  a  V,  to  minimize  possible  interference 
from  the  earlier  tasks  while  retaining  the  symmetry  of  the  V.  Second, 
the  A  appeared  with  its  base  almost  touching  the  left  edge  rather  than 
the  near  edge  of  the  cube  image.  The  shape  and  the  position  of  this 
orientation  cue  are  asymmetric  with  respect  to  the  LEFT  and  RIGHT 
stimulus  keys,  but  symmetric  with  respect  to  the  NEAR  and  FAR  keys, 
reversing  the  relationships  of  Experiment  1.  Figure  10  is  a  schematic 
illustration  of  the  stimulus  cube  as  it  might  appear  on  a  typical 
trial. 


Figure  10:  Schematic  diagram  of  the  stimulus  cube  for  a  typical 
trial  in  Experiment  5,  which  was  identical  with  Experiment  1 
except  that  both  the  shape  and  the  orientation  of  the 
orientation  cue  were  changed.  The  new  orientation  cue  was  a 
capital  letter  A  with  its  base  on  the  left  edge  of  the  bottom 
face. 


The  procedure  used  was  identical  to  that  in  Experiment  1:  in 
fact,  exactly  the  same  stimulus  sequence  was  followed,  to  maximize  the 
comparability  of  the  results.  The  cube  image  appeared  in  24 
orientations  constituting  a  complete  revolution  around  the  cube's 
vertical  Y-axis,  and  the  same  three  observers  served. 

Results  and  Discussion.  The  results  are  presented  in  Figure  11. 


Figure  11:  Mean  reaction  times  in  the  Y-axis  task  for  each  of 
the  five  response  buttons,  as  a  function  of  the  orientation  of 
the  stimulus  cube,  using  a  capital  A  on  the  left  edge  of  the 
bottom  face  as  orientation  cue. 


The  reaction  times  for  each  of  the  five  responses  are  plotted  as  a 
function  of  the  orientation  of  the  stimulus  cube,  expressed  as  a 
rotation  away  from  the  head-on  orientation.  As  predicted,  the 
asymmetric  position  and  shape  of  the  A  made  it  easy  to  distinguish 
between  LEFT  and  RIGHT  stimulus  keys,  since  the  foot  of  the  A  almost 
touched  the  LEFT  face  of  the  stimulus  cube,  and  its  apex  pointed 
towards  the  RIGHT  face.  Observers  were  able  to  use  a  relational 
strategy  to  distinguish  between  these  responses,  and  the  LEFT  and  RIGHT 
reaction  time  functions  show  plateaus  similar  to  those  in  the  NEAR  and 
FAR  functions  in  Experiment  1.  On  the  other  hand,  the  NEAR  and  FAR 
stimulus  keys  could  not  be  distinguished  on  the  basis  of  the  asymmetry 
of  either  the  position  or  the  shape  of  the  A.  Observers  were  forced  to 
adopt  a  rotational  strategy,  which  resulted  in  reaction  time  functions 
that  were  sharply  peaked  at  the  half-rotation  orientations.  Thus, 
changing  the  orientation  cue  from  a  V  on  the  bottom  face  pointing  at 
the  cube's  front  edge  into  an  A  on  the  bottom  face  standing  on  the 
cube's  left  edge  reversed  the  types  of  function  associated  with  the 
LEFT/RIGHT  and  the  NEAR/FAR  pairs  of  responses. 


Experiment  6 


Experiment  6  also  involved  only  a  change  in  the  orientation  cue: 
this  time,  the  cue  was  made  asymmetric  in  shape  relative  to  both  pairs 
of  responses.  A  modified  letter  V  was  used,  drawn  on  the  bottom  face 
of  the  stimulus  cube  with  its  apex  almost  touching  the  front  edge  as 
before,  but  with  an  additional  cross-bar  or  serif  added  at  the  top  and 
to  the  left  of  the  left  arm  of  the  V,  as  illustrated  schematically  in 
Figure  12. 


The  serif  made  the  V  left/right  asymmetric  as  well  as  up/down 
asymmetric.  The  serif  pointed  towards  the  LEFT  key  and  away  from  the 
RIGHT  key,  and  as  before  the  apex  of  the  V  pointed  towards  the  NEAR  key 
and  away  from  the  FAR  key.  The  same  procedure  was  followed,  with  the 
stimulus  cube  appearing  in  24  orientations  constituting  a  revolution 
about  the  vertical  Y-axis,  and  the  same  three  observers  served. 


Results  and  Discussion.  The  five  reaction  time  functions  are 
shown  in  Figure  13.  All  four  of  the  confusable  responses  yielded 
plateau-shaped  functions,  consonant  with  our  expectation  that  observers 
would  be  able  to  use  the  relatively  efficient  relational  strategy.  The 
large  peaks  associated  with  use  of  the  rotational  strategy  have 
disappeared.  The  results  demonstrate  that  display-control 
incompatibility  can  be  reduced  by  appropriate  coding  of  the  orientation 
of  objects  within  the  display,  and  show  that  it  is  very  important  to 
desymmetrize  any  object  presented  in  a  3-D  display,  to  permit  operators 
to  determine  orientation  directly  from  the  displayed  object  without 
having  to  resort  to  the  potentially  slow  and  inaccurate  rotational 
strategy.

Figure 12: Schematic diagram of the stimulus cube for a typical
trial in Experiment 6. The orientation cue is a capital letter
V oriented as in Experiment 1, but with an extra serif attached
to the top of its left upright.


Since the images displayed on SpaceGraph are generated from
abstract  data  in  the  computer,  it  is  up  to  the  applications  designer  to 
choose  representations  or  icons  of  the  objects  to  be  displayed.  For 
example,  two  options  in  an  Air  Traffic  Control  application  would  be  to 
display  aircraft  positions  in  an  integrated  plan-  and  vertical- 
situation  display  (1)  as  points,  or  (2)  as  miniature  aircraft  icons. 

In  the  former  case,  heading  information  might  be  indicated  by  a  wake 
extending  behind  the  aircraft,  or  might  not  be  indicated  in  the  display 
at  all.  Even  with  the  wake,  it  would  take  some  time  before  a  sudden 
turn would be clearly visible in the wake, if the scale were small. On
the  other  hand,  if  a  small  aircraft  icon  were  placed  at  the  appropriate 
point  in  the  display,  the  attitude  of  the  icon  would  show  the 
aircraft's  present  heading  directly,  and  a  sudden  turn  would  be  seen 
immediately  as  a  discrepancy  between  the  attitude  of  the  icon  and  the 
wake.  The  representations  that  are  chosen  will  have  large  effects  on 
how  easy  the  display  is  to  interpret  and  use.  Our  experiments  have 
shown  that,  in  an  application  where  orientation  of  the  displayed 
objects  is  important,  the  icon  chosen  should  deliberately  be  made 
asymmetric  on  all  its  major  axes.  In  a  left/right  symmetrical  icon, 
such  as  an  aircraft,  the  right  side  should  be  made  different  from  the 
left  side,  for  example  by  filling  in  that  half  of  the  icon,  or  by 
adding a vertical fin at the end of the right wing. This addition will
permit the operator to use a relational strategy to determine the
orientation of the icon, which we have seen is much more efficient than
a rotational strategy.

Figure 13: Mean reaction times in the Y-axis task for each of
the five response buttons, as a function of the orientation of
the stimulus cube, using an asymmetric capital V almost
touching the front edge of the bottom face as orientation cue.
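
The asymmetry recommendation above can be checked mechanically while an icon is being designed. The following is a minimal Python sketch and not part of the original study. It assumes an icon is described by a set of integer vertex coordinates centered on the origin, with axis 0 as left/right, axis 1 as up/down, and axis 2 as fore/aft. The function names and the sample aircraft icon are hypothetical.

```python
def is_mirror_symmetric(points, axis):
    """True if reflecting every point through the plane axis == 0 maps the set onto itself."""
    pts = {tuple(p) for p in points}
    return all(
        tuple(-c if i == axis else c for i, c in enumerate(p)) in pts
        for p in pts
    )


def symmetric_axes(points):
    """List the axes (0, 1, 2) about which the icon is still mirror-symmetric."""
    return [axis for axis in range(3) if is_mirror_symmetric(points, axis)]


# Hypothetical aircraft icon: nose, tail, two wing tips, and a tail fin.
aircraft = [(0, 0, 2), (0, 0, -2), (-2, 0, 0), (2, 0, 0), (0, 1, -2)]

# The same icon with an extra vertical fin on the right wing tip.
aircraft_with_wing_fin = aircraft + [(2, 1, 0)]

print(symmetric_axes(aircraft))                # left/right symmetry remains
print(symmetric_axes(aircraft_with_wing_fin))  # no symmetric axis remains
```

An icon for which `symmetric_axes` returns an empty list is asymmetric on all three major axes in the sense used here, so an operator could in principle read its orientation without mental rotation.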

A  second  important  point  is  that  even  the  relatively  efficient 
relational strategy is not as effective as a spatially-based strategy,
where the conditions are appropriate for that to be applied. This
suggests  that  it  might  be  beneficial  to  give  the  operator  controls  that 
would  allow  him  to  rotate  the  display  space  into  a  preferred 
orientation.  Unfortunately,  in  our  present  implementation  of 
SpaceGraph,  it  is  not  possible  to  generate  a  new  image  from  a  different 
viewpoint  fast  enough  to  permit  meaningful  experiments  to  be  run  to 
test  either  this  suggestion,  or  whether  the  suggestion  applies  to  a 
multi-vehicle  control  task  as  well.  The  commercial  version  of 
SpaceGraph,  of  which  initial  deliveries  have  already  been  made,  is 
sufficiently  fast  to  support  such  experiments,  however,  and  we  hope  to 
be able to work on them soon.


STEREOPSIS HAS THE "EDGE" IN 3-D DISPLAYS

Menlo Park, California 94025


ABSTRACT 

This  paper  reports  the  results  of  studies  conducted  at  SRI 
International  to  explore  differences  in  image  requirements  for  depth 
and form perception with 3-D displays. Monocular and binocular
stabilization of retinal images was used to separate form and depth
perception  and  to  eliminate  the  retinal  disparity  input  to  stereopsis. 
Results  suggest  that  depth  perception  is  dependent  upon  illumination 
edges  in  the  retinal  image  that  may  be  invisible  to  form  perception, 
and  that  the  perception  of  motion-in-depth  may  be  inhibited  by  form 
perception,  and  may  be  influenced  by  subjective  factors  such  as 
ocular dominance and learning.

As display-system technology has advanced, we have learned how
to present features to the human observer in such a way that his
perceptual processing mechanisms can synthesize the desired view of the
world.  The  methods  we  have  chosen  have  not  always  duplicated  events 
as  they  naturally  occur.  For  example,  when  the  dimension  of  motion  was 
added  to  displays,  we  found  that,  although  the  motion  of  objects  in  the 
real world is usually continuous, it was not necessary to provide
continuous motion of display features for the observer to see these
features moving smoothly. A series of still pictures presented in rapid
succession  was  entirely  adequate  for  producing  the  perception  of 
continuous  motion.  Likewise,  when  color  was  added  to  the  displays,  it 
was  accomplished  by  presenting  the  observer  with  display  features 
whose  color  was  synthesized  from  a  limited  number  of  relatively  narrow- 
band  spectral  elements,  rather  than  from  the  continuous  spectral  range 
found  in  nature. 

Now  we  are  on  the  threshold  of  adding  another  dimension  to  display 
systems — depth.  To  do  this,  we  must  know  which  of  the  many  stimulus 
variables  in  the  real  world  are  important  to  the  stereopsis  mechanism, 
and  which  are  not.  To  acquire  this  knowledge,  we  sought  to  answer 
three  major  questions  about  3-D  displays:  1)  What  features  of  3-D 
displays  promote  stereopsis?  2)  Do  these  same  features  promote  form 
perception?  3)  How  can  stereoscopic  depth  perception  be  optimized? 
Ascertaining  which  stimulus  features  underlie  the  perception  of  depth 
and  form  is  the  essential  first  step  toward  optimizing  the  perception 
of  both  on  visual  displays. 

We  have  approached  this  problem  in  two  ways:  by  isolating  inputs 
to  stereopsis  and  by  systematically  reducing  the  effectiveness  of 
inputs to stereopsis. This discussion will concentrate upon the
isolation of inputs.

To  say  that  we  isolated  inputs  to  stereopsis  could  imply  that 
depth perception was stimulated without form perception, or that two
members of the triad (vergence, retinal disparity, and accommodation)
were held constant while the third was manipulated. We wish to imply
both  these  situations. 

By  using  a  technique  that  stabilized  selected  features  of  one  or 
both  retinal  images,  we  were  able  to  alter  or  eliminate  form  perception 
in one or both eyes. We did this to investigate whether depth
perception was possible without form perception and to see if it might be
possible  to  modify  the  features  presented  on  3-D  displays  so  as  to 
enhance  depth  perception  without  distorting  form  perception. 

To  stabilize  the  retinal  image,  we  used  an  SRI  eyetracker  and 
stimulus-deflector  system.  Without  going  into  details,  let  me  say 
that  the  eyetracker  can  determine  eye  position  dynamically  over  a  20 
to  30°  range  of  gaze  angle  to  an  accuracy  on  the  order  of  1'  of  arc. 
Horizontal  and  vertical  eye-movement  signals  from  the  eyetracker  can 
be used to drive the mirrors of the stimulus deflector.

The stimulus deflector, shown diagrammatically in Figure 1,
requires a bit more explanation. An observer views the display screen
through two unity-magnification relay lens pairs. Conceptually, each
unity-magnification lens pair reimages the observer's eye without
magnification. Rotation of a mirror placed in the plane of the
reimaged observer's eye results in retinal-image displacements
identical to those that would occur if the observer's eye were rotated about
this  point.  By  using  two  orthogonal  rotating  mirror  systems,  we  can 
produce  any  desired  retinal-image  motion. 


FIGURE 1 SCHEMATIC OF STIMULUS DEFLECTOR

CR, center of rotation of eye; L1, L2, L3, and L4, multiple-element camera lenses;
LP, lens pair; AP, artificial pupil; DS, display screen; Mv, mirror that rotates the
visual field vertically; MH, mirror that rotates the visual field horizontally; M1, fixed
mirror; L4, MH, and mirror M2 move in synchronism to adjust the optical distance
to the display screen.

To  stabilize  the  retinal  image  of  any  display  feature  viewed 
through the stimulus deflector system, we use the horizontal and
vertical eye-rotation signals from the eyetracker to drive the corresponding
mirrors in the stimulus-deflector system. When the gain between the
two instruments is set precisely, any rotation of the observer's eye is
exactly compensated by a rotation of the stimulus-deflector mirrors,
resulting in the elimination of motion of the retinal image. In
addition, there is a plane conjugate to the observer's retina between the
lenses  of  the  first  relay  pair.  Because  this  image  plane  is  proximal 
to  either  movable  mirror,  stimuli  placed  in  this  image  plane  will 
remain  unstabilized  (i.e.,  normal).  Typically,  fixation  points  and 
sharply  focused  vertical  field  stops,  which  I'll  refer  to  as  occluders, 
are  placed  in  this  unstabilized  image  plane. 
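
The role of the gain setting described above can be illustrated with a small Python sketch. It is only an illustration under a simplifying assumption that is not stated in the source: the compensating image deflection is taken to be a linear function of the eye-rotation signal, with `gain` expressed as degrees of image deflection per degree of eye rotation. The function names are hypothetical.

```python
def compensation_deg(eye_rotation_deg, gain):
    """Image deflection produced by the mirrors in response to the eyetracker signal."""
    return gain * eye_rotation_deg


def residual_image_motion(eye_rotation_deg, gain):
    """Retinal-image motion left over after the mirrors have responded."""
    return eye_rotation_deg - compensation_deg(eye_rotation_deg, gain)


# Gain 0 leaves the image unstabilized; gain 1 removes all image motion.
for gain in (0.0, 0.9, 1.0):
    print(gain, residual_image_motion(2.0, gain))
```

Under this model a gain of exactly 1 corresponds to the stabilized condition, and a gain of 0 corresponds to normal, unstabilized viewing.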

The  stimulus-deflector  mirrors  can  also  be  driven  by  sources  other 
than the eyetracker. For example, in most of our studies of the
perception of motion-in-depth, we presented the observer with stereo
image  pairs  that  oscillated  horizontally  in  sinusoidal  antiphase,  i.e., 
alternately  toward  and  away  from  one  another.  This  motion  was  imparted 
to  the  stimuli  by  driving  the  stimulus  deflector  mirrors  with  sinewave 
generators. 
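
The antiphase drive described above can be sketched in a few lines of Python. This is an illustrative model only: the amplitude, frequency, and function names are hypothetical, and which sign of the relative offset corresponds to crossed disparity depends on the optics, which the sketch does not attempt to represent.

```python
import math


def antiphase_offsets(t_s, amplitude_deg, freq_hz):
    """Horizontal offsets of the left and right images at time t_s (seconds)."""
    phase = 2.0 * math.pi * freq_hz * t_s
    left = amplitude_deg * math.sin(phase)
    right = amplitude_deg * math.sin(phase + math.pi)  # half a cycle out of phase
    return left, right


def relative_offset(left, right):
    """Difference between the two image offsets, which is the changing disparity signal."""
    return left - right


# Sample one cycle of a 1 Hz oscillation with 0.5 degree amplitude.
for step in range(5):
    t = step * 0.25
    left, right = antiphase_offsets(t, 0.5, 1.0)
    print(t, round(left, 3), round(right, 3), round(relative_offset(left, right), 3))
```

Because the two offsets are always equal and opposite, their mean stays at zero while their difference oscillates with twice the single-image amplitude. In this model the fused target therefore has no net lateral displacement, only a changing disparity.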

One  of  the  display  systems  we  used  is  shown  in  Figure  2.  It  is  a 
rear  projection  system  that  can  present  either  a  monocular  image  or 
binocular mirror images of slides inserted in the projector. The
system can also be used simply to illuminate the rear projection screen.


FIGURE  2  SCHEMATIC  OF  DISPLAY  SYSTEM 

ND, neutral density filter; PR, rotatable plane polarizer; Pv, vertical
plane polarizer; PH, horizontal plane polarizer; M1 and M2, front
surface mirrors; P1 and P2, prisms.

In this illumination configuration, transparent gelatin filters or
opaque figures can be affixed to the screen, presenting the observer
with sharply defined luminance or chrominance patterns.

During  the  course  of  our  investigations,  several  response  modes 
were  used  to  indicate  perceptual  changes.  Our  earliest  studies  were 
phenomenological in nature; observers made verbal reports of their
perception of the stimuli. Later we began to quantify our results, having
the observers respond by changing the positions of switches to indicate
changes in perceptual state. Eye movements were monitored by the
eyetracker during all these studies, and the eyetracker signals were
frequently recorded on a four-channel recorder. An example of eye-movement
records  and  subject  responses  will  be  presented  later. 

Now  let  me  tell  you  some  of  our  observations.  From  the  very  first 
observations  of  stabilized  images,  it  was  apparent  that  filling-in  was 
occurring  across  the  stabilized  areas.  Filling-in  is  a  very  common 
phenomenon  that  seems  to  occur  whenever  an  insensitive  patch  of  retina 
is  contiguous  with  a  sensitive  patch.  For  example,  perception  is 
filled  in  across  your  blind  spot  and  across  small  scotomas.  To  the 
extent  that  the  disappearance  of  stabilized  images  mimics  suppression, 
we  may  assume  that  filling-in  also  occurs  across  suppressed  areas  of 
the visual field. Knowledge of the origin of the percepts seen in
suppressed areas may be beneficial in deciding which features to present
in  3-D  display  systems  and  how  best  to  present  them.  The  sequence  of 
results  that  I  will  present  next  will  show  the  development  of  our 
understanding  of  the  phenomenon  of  filling-in  and  of  the  effects  of 
edges,  both  perceived  and  invisible,  on  this  process.  I  will  then 
discuss  how  these  edges  affect  form  and  depth  perception. 

One  of  our  earliest  observations  involved  the  simple  stimulus 
shown in Figure 3. It was a vertical black stripe on a white
background, which the observer viewed monocularly. This figure and
subsequent figures do not accurately portray the observer's field of view.
The circular field of view subtended approximately 25° and the outer
edges were very blurred, being formed by the circular lens holder that
was located well within the observer's near point. Also, the
horizontal-line pair drawn across the edge was not on the stimulus.
It is the symbol we've adopted to indicate that that edge is
stabilized. Edges in the field of view that do not show the
horizontal-line pair are unstabilized edges.

After viewing the black bar shown here for a few seconds, the
observer found that it disappeared because it was stabilized, and the
observer then saw a uniform gray field. However, if the stabilized
black bar was flanked by a pair of unstabilized occluders such as
those shown in Figure 4, then the observer saw a white field when the
stabilized black bar disappeared, rather than the gray field seen in
the previous experiment. We interpret these results to mean that the
lightness of the stabilized field of view is determined by the
luminance contrast at the edges bounding the uniform field. In the
absence of the occluders, the contrast gradient at the edge of the
field was shallower and further in the periphery, resulting in the
perception of the uniform gray field. When the occluders were introduced,
a much steeper luminance-contrast gradient was present in the near
periphery. The high-contrast edges made the field appear white.

FIGURE 3 A STABILIZED BLACK BAR ON A WHITE BACKGROUND

We  then  turned  our  attention  to  chromatic  stimuli.  Initially,  we 
created  stimuli  without  controlling  the  luminous  component;  Figure  5 
shows one of our early chromatic stimulus configurations. It consisted
of a stabilized green disk on a red background whose angular subtense
was limited to 12° by a circular unstabilized black mask. When the
stabilized green disk disappeared, the observer reported seeing a
uniform circular red field. We next presented the observer with stimuli
such as those shown in Figure 6, consisting of the same stabilized
green disk on a red background, but now with the red background
extending to the limits of the viewing field and having diffuse edges. In
the center of the stabilized green disk, we positioned a circular
unstabilized black mask. Under these conditions, when the stabilized
red/green boundary disappeared, the observer reported seeing the
central black mask on a uniform green field. In both of the preceding
chromatic experiments, the chromatic appearance of the field was in
agreement with the chromatic contrast at the only perceptible edge in
the field.

FIGURE 4 A STABILIZED BLACK BAR ON A WHITE BACKGROUND
WITH FLANKING UNSTABILIZED OCCLUDERS

One  of  our  most  interesting  chromatic  studies  involved  a  stimulus 
configuration  whose  only  perceptible  edges  contained  conflicting 
chromatic  information.  The  stimulus  configuration  is  similar  to  that 
shown  in  Figure  7.  It  consisted  of  a  vertically  divided  bipartite 
field,  green  on  the  left  and  red  on  the  right,  framed  by  a  pair  of 
vertical  unstabilized  black  occluders.  The  boundary  between  the  red 
and  green  halves  of  the  field  was  stabilized.  From  the  results  of  our 
previous stabilized chromatic experiments we predicted that the
chromatic contrast at the left unstabilized occluder would make the field
appear  uniformly  green,  and  simultaneously  the  chromatic  contrast 
present  at  the  right  unstabilized  occluder  would  make  the  field  appear 
uniformly  red.  It  is  impossible  to  produce  a  slide  showing  the 
observer's  perceptions  of  this  stimulus.  Numerous  observers  reported 
the field as reddish-green, greenish-red, or something that they
recognized as a color but had never seen before and for which they had no
name.  The  anticipated  conflict  was  evident  indeed. 

FIGURE 5 A STABILIZED GREEN DISC ON A RED BACKGROUND

Back in the world of black and white, we conducted several
quantitative experiments to investigate the effects of perceptible edges
on lightness perception. The stimulus configuration we used is shown
in  Figure  8.  It  consisted  of  stabilized  black  and  white  backgrounds, 
each  showing  a  single  unstabilized  gray  target  square.  The  black  and 
white backgrounds were surrounded by a uniform unstabilized gray
surround of the same luminance as the two target squares. As long as the
black and white backgrounds were visible as such, the lightness of the
two target squares appeared to be the same. After disappearance of the
stabilized black and white backgrounds, the observer perceived a
uniform gray field upon which were positioned a black target square and a
white target square. The results consistently indicated that the
lightness of the unstabilized target squares changed when the
perception of the background changed from black and white to gray. The point
here  is  that  there  is  no  change  in  retinal  illumination,  but  there  is 
a  change  in  lightness.  Perceived  lightness  of  the  target  squares 
varied  with  the  perceived  brightness  contrast  of  these  target  squares 
with  their  background,  not  with  the  retinal  illuminance  contrast  of 
these  target  squares  with  their  background. 

The chromatic analogues of these experiments yielded equally
dramatic results. They indicated that perceived color contrasts rather
than retinal wavelength contrasts determine the color of unstabilized
target squares seen against stabilized chromatic backgrounds.

FIGURE 6 A STABILIZED GREEN DISC ON A RED BACKGROUND
WITH AN UNSTABILIZED CENTRAL BLACK MASK

We conclude from the foregoing that the color and lightness of
objects are determined by the chrominance and luminance contrast at the
perceived edges of these objects, not by the contrast at the
retinal-image edges of the objects.

Although  the  visual  system  frequently  suppresses  information 
about  edges  imaged  on  one  or  the  other  retina,  for  example,  to  obtain 
a single binocular image, it is nonetheless capable of using that
suppressed retinal-edge information to produce depth perception.
Therefore, we naturally asked whether stabilized edges, like suppressed
edges, might also be capable of supporting depth perception. If this
were the case, then it might be appropriate to reconsider what
information should be displayed on three-dimensional display systems.

FIGURE 7 A STABILIZED RED/GREEN BOUNDARY

Because of suggestions in the literature about the different
effects of chromatic and achromatic information on depth perception,
we produced our stimuli very carefully to eliminate chromatic gradients
from our luminance experiments and luminance gradients from our
chromatic experiments. One of the simpler stimuli we used consisted of a
vertical black bar on a white background, which the observer viewed
with  both  eyes.  The  image  of  the  black  bar  was  stabilized  on  one 
retina  and  unstabilized  on  the  other.  The  image  of  the  unstabilized 
bar could move laterally back and forth. Before stabilizing the
stationary bar, the observer positioned the two bars so that they appeared
as  a  single  fused  bar  in  the  center  of  his  field  of  view.  We  then 
moved  the  image  of  one  of  the  bars  laterally  back  and  forth,  and  the 
observer  saw  a  single  fused  bar  moving  in  depth  along  a  diagonal  path. 
Next,  we  stabilized  the  image  of  the  stationary  bar  to  disappearance 
and  moved  the  image  of  the  other  bar  laterally  as  before.  Under  these 
conditions,  the  observer  still  saw  the  black  bar  moving  in  depth  along 
a  diagonal  pathway  even  though  there  was  no  bar  perceptible  to  one  of 
his eyes. Depth was maintained in the absence of binocular form
perception.

When  we  performed  the  chromatic  analogue  of  this  experiment,  using 
isoluminant  chromatic  stimuli,  we  found  that  only  a  weak  motion-in- 
depth  percept  was  generated  when  both  stimuli  were  unstabilized,  and 
that when one stimulus was stabilized, no motion-in-depth percept was
produced. Isoluminant chromatic stimuli gave only weak depth
information at best.

FIGURE 8 UNSTABILIZED GRAY SQUARES ON STABILIZED
BLACK AND WHITE BACKGROUNDS

FIGURE 9 A SINE WAVE GRATING

We proceeded to try to quantify the depth percept obtained with
the achromatic stabilized stimulus. In the experiment, each of the
observer's eyes viewed a vertical luminance sine-wave grating similar
to that shown in Figure 9. The grating might be any one of nine
spatial frequencies, but the observer was always presented with two
gratings of the same spatial frequency. One of the gratings was
stationary and of fixed contrast, and the other moved laterally back and
forth and was of variable contrast. When the contrast of the movable
grating was high enough, the observer saw a single fused grating
moving in depth along a diagonal path. When the contrast of the moving
grating was too low to support stereopsis, the observer saw either the
variable grating moving laterally, or, at very low contrasts, just the
stationary grating. We measured the contrast-sensitivity function for
the perception of lateral motion and of motion-in-depth for the grating
as a function of the spatial frequency of the grating under unstabilized
and stabilized conditions. Figure 10(a) shows data from one of our
observers under unstabilized conditions. As you might expect, the
contrast required for detecting motion-in-depth is higher than the contrast
required for detecting the lateral motion of the grating. In Figure
10(b)  we  see  the  data  from  the  same  subject  under  conditions  where  the 
stationary  grating  was  stabilized  to  disappearance.  Under  this 
stabilized-image  condition,  the  observer  required  less  contrast  in  the 
moving  grating  to  see  it  moving  either  laterally  or  in  depth.  To  our 
knowledge,  this  is  the  first  quantitative  evaluation  of  the  sensitivity 
of  the  stereopsis  mechanism  to  stabilized  images.  The  data  suggest 
that  the  sensitivity  of  the  motion-in-depth  perception  mechanism 
increases  when  form  perception  is  eliminated  in  one  of  the  observer's 
eyes. 
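
For readers unfamiliar with grating stimuli, the following Python sketch shows how a luminance sine-wave grating of a given contrast and spatial frequency is conventionally defined. It is offered as background only. The mean luminance, contrast, and spatial-frequency values are hypothetical, and the Michelson definition of contrast is an assumption, since the text does not say which contrast measure was used.

```python
import math


def grating_luminance(x_deg, mean_luminance, contrast, cycles_per_deg, phase_rad=0.0):
    """Luminance of a vertical sine-wave grating at horizontal position x_deg."""
    return mean_luminance * (
        1.0 + contrast * math.sin(2.0 * math.pi * cycles_per_deg * x_deg + phase_rad)
    )


def michelson_contrast(luminances):
    """(Lmax - Lmin) / (Lmax + Lmin) over a set of luminance samples."""
    lo, hi = min(luminances), max(luminances)
    return (hi - lo) / (hi + lo)


# Sample one degree of a 1 cycle/degree grating at 30 percent contrast.
samples = [grating_luminance(i / 400.0, 50.0, 0.3, 1.0) for i in range(401)]
print(round(michelson_contrast(samples), 3))
```

A lateral shift of the movable grating corresponds to changing `phase_rad`, and lowering its `contrast` toward threshold is the manipulation described in the paragraph above.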


These  results  imply  that  stereopsis  (at  least  the  motion-in-depth 
system)  can  compare  the  locations  of  corresponding  edges  on  the  two 
retinas  even  if  only  one  of  these  edges  reaches  perceptual  awareness. 
Furthermore,  it  is  the  luminance  component,  rather  than  the  chromatic 
component,  of  these  edges  that  supports  stereopsis. 

These observations raised some interesting questions relative to
three-dimensional display systems. For example, if binocular form
perception is not essential to stereopsis, is it necessary for
three-dimensional displays to present two complete stereo images to an
observer? Also, if chromatic edges do not convey depth information, is
it necessary for both stereograms to be chromatic?

(a) UNSTABILIZED THRESHOLDS (b) STABILIZED THRESHOLDS

FIGURE 10 COMPARISON OF STABILIZED AND UNSTABILIZED THRESHOLDS

The  results  of  our  monocular  stabilization  studies  prompted  us  to 
ask  whether  depth  perception  might  remain  when  form  perception  had  been 
eliminated  in  both  eyes.  To  test  this,  we  used  a  binocular  pair  of 
eyetrackers  and  stimulus  deflectors  such  as  those  shown  in  Figure  11. 
With  this  apparatus,  it  was  possible  for  us  to  stabilize  both  retinal 
images.  We  presented  the  observer's  left  eye  with  an  image  similar  to 
that shown in Figure 12(a), consisting of an unstabilized black
fixation point and unstabilized black occluders framing a white field upon
which was presented a stabilized black bar. We presented a mirror
image of this stimulus to the observer's right eye [Figure 12(b)].
Before disappearance of the stabilized black bar, the observer's
perception of the stimulus was as shown in Figure 13(a). He saw a single
fixation  point  centered  in  an  aperture  that  was  produced  by  a  fused 
pair  of  occluders,  and  behind  the  aperture  at  some  distance,  he  saw  a 
single  fused  black  bar.  Upon  disappearance  of  the  stabilized  black 
bar,  the  observer's  view  consisted  of  the  fixation  point  and  occluders 
lying  in  one  plane  and  a  uniform  empty  white  plane  at  some  distance 
behind  the  occluders,  as  shown  in  Figure  13(b).  Thus,  the  depth  plane 
specified  by  the  retinal  disparity  of  the  black  bar  imaged  in  each  eye 
remained,  even  though  form  perception  of  both  black  bars  was  eliminated. 

FIGURE 11 BINOCULAR IMAGE STABILIZER

Next, we provided observers with a comparison stimulus consisting
of a movable unstabilized black bar. The experimenter adjusted the
position of the comparison bar until the observer indicated that it was
in  the  plane  occupied  by  the  fixation  point  or  the  plane  occupied  by 
the  fused  black  bar.  Two  sets  of  measurements  were  made,  one  when  the 
stabilized  black  bar  was  visible,  and  the  other  when  it  was  invisible 
to  both  eyes.  Figure  14  shows  that  the  observer  perceived  two  planes 
separated  by  about  25  cm  whether  the  black  bar  was  visible  or  invisible. 
The  observers  saw  depth  without  form. 

What  we  have  learned  from  these  experiments  is  that  stereopsis, 
which  includes  both  stereoscopic  depth  perception  and  the  perception  of 
motion-in-depth, does not require form perception. However, it does
require luminance edges in both retinal images; chrominance edges are
not  sufficient. 

(a) LEFT EYE VIEW OF THE STABILIZED AND UNSTABILIZED STIMULUS ELEMENTS
(b) RIGHT EYE VIEW OF THE STABILIZED AND UNSTABILIZED STIMULUS ELEMENTS

FIGURE 12 OBSERVER'S VIEW OF STIMULUS ELEMENTS

(a) BEFORE DISAPPEARANCE OF THE BLACK BAR
(b) AFTER DISAPPEARANCE OF THE BLACK BAR

FIGURE 13 OBSERVER'S VIEW OF THE STIMULUS ELEMENTS

When we began our binocular stabilization studies, we observed an
interesting phenomenon resulting from our ability to eliminate retinal
disparity signals to stereopsis, which allowed us to assess the effects
of vergence alone. When we first presented an observer with binocularly
stabilized black bars, the field of view did not include any fixation
points or unstabilized occluders. Under these conditions, there were
no anchoring stimuli to align the observer's eyes. Consequently, when
the observer's eyes converged spontaneously, he saw the black bar
advancing, and when they diverged, he saw the black bar receding. This
perceptual  motion-in-depth  must  have  resulted  from  vergence  inputs  to 
stereopsis,  because  changes  in  retinal  disparity  were  eliminated  by 
stabilizing  both  retinal  images.  We  also  found  that  the  magnitude  of 
the  perceived  motion-in-depth  was  much  greater  when  the  binocularly 
stabilized  images  of  the  black  bar  were  shifted  on  each  retina,  to  be 
imaged  with  static  crossed  retinal  disparity,  than  when  they  were  each 
imaged  across  the  fovea,  so  as  to  have  no  static  retinal  disparity. 

FIGURE 14 PERCEIVED DISTANCE TO FIXATION POINT AND PLANE OF STRIPE

In our binocular experiments, we also observed some size
distortion accompanying the motion-in-depth distortion. When an object
appeared to approach, it also appeared to get smaller, because the
size of its retinal image did not change. These results show us that
the position of edges in the retinal image affects such factors as the
efficacy of vergence signals to stereopsis and size constancy. It is
important  that  we  understand  these  edge-position  effects  when  producing 
three-dimensional  displays,  particularly  because  we  do  not  know  where 
the  observer  will  be  looking  at  any  given  instant. 

Let  us  now  leave  the  selectively  stabilized  image  technique  and 
examine  some  changes  in  the  perception  of  motion-in-depth  that  occur 
when  the  stimulus  parameters  of  three-dimensional  displays  are  varied. 
Using the psychophysical method of limits, we asked how much corresponding
edges in three-dimensional display systems can be changed along
various display dimensions before stereopsis suffers.

In the technique we used to address this question, observers
viewed mirror images of geometric targets through the binocular
stimulus-deflector system. The horizontal deflection mirrors of the
two stimulus-deflector systems oscillated sinusoidally in antiphase.
This moved the two retinal images alternately nasally and temporally,
producing a compelling illusion of motion-in-depth of the single fused
target. The observer's task was to decide when the target appeared to
move exclusively in-depth, ambiguously, or exclusively laterally. We
recorded the observer's subjective response to the motion of the stimulus
along with a record of the motion of the stimulus and a record of
the observer's eye movements.
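
The antiphase drive can be illustrated with a minimal sketch. It assumes that each image's horizontal offset follows a pure sinusoid; the amplitude, frequency, and sign convention below are hypothetical values chosen for illustration, not parameters of the apparatus. The point is only that antiphase offsets cancel in their common (lateral) component, while their difference, which is the disparity signal, oscillates at twice the single-image amplitude.

```python
import math

def image_offsets(t_sec, amplitude_deg=1.0, freq_hz=0.5):
    """Horizontal offsets (degrees) of the left and right images at time t.

    The two deflection mirrors are modeled as sinusoids in antiphase,
    i.e. shifted by half a cycle. Amplitude and frequency are hypothetical.
    """
    phase = 2.0 * math.pi * freq_hz * t_sec
    left = amplitude_deg * math.sin(phase)
    right = amplitude_deg * math.sin(phase + math.pi)  # antiphase
    return left, right

def disparity_and_version(left, right):
    """Split the two offsets into a relative and a common component."""
    disparity = left - right        # relative offset: the depth signal
    version = (left + right) / 2.0  # common offset: lateral displacement
    return disparity, version

# Sample one full cycle of the hypothetical 0.5 Hz oscillation.
for step in range(5):
    t = step * 0.5
    d, v = disparity_and_version(*image_offsets(t))
    print(f"t={t:.1f} s  disparity={d:+.2f} deg  version={v:+.2f} deg")
```

With these assumed values the common component stays at zero throughout the cycle, so any lateral motion the observer reports cannot come from a net sideways displacement of the stimulus.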

Some  of  the  parameters  we  varied  were  the  luminance  contrast 
between  the  target  and  its  background,  the  luminance  level  of  the  target, 
and  interocular  contrast.  Conceptually,  interocular  contrast  is  the 
ratio  between  the  luminances  of  the  two  monocularly  perceived  targets 
of  the  stereogram.  Interocular  contrast  has  been  shown  to  be  associated 
with  perceptual  distortions,  such  as  the  Pulfrich  phenomenon.  In  a 
typical experiment, observers were seated before the binocular
eyetracker/stimulus-deflector system and viewed targets having a mean
luminance  of,  say,  3.0  foot  lamberts,  and  having  a  contrast  with  their 
surround  of,  say,  60  percent.  Starting  from  a  condition  in  which  the 
luminance  of  the  left  and  right  targets  was  the  same,  and  the  observer 
saw  the  single  fused  target  moving  in  depth,  we  varied  the  luminance  of 
the  left  and  right  targets  inversely,  so  that  as  one  got  brighter  the 
other  got  dimmer.  Eventually  the  luminance  difference  between  the  two 
targets  was  great  enough  that  at  times  the  observer  did  not  see  the 
single  fused  target  simply  moving  unambiguously  in  depth,  but  rather  he 
saw a combination of lateral motion and motion-in-depth. As the interocular
contrast ratio between the two targets continued to increase, the
observer saw only lateral motion of the targets. During the experiment,
the observer indicated the type of motion he was seeing: motion-in-depth,
ambiguous (both lateral and depth), or lateral motion only. An
example  of  data  collected  in  these  experiments  is  shown  in  Figure  15. 
During  periods  of  versional  eye  movements  the  observer  tended  to  see 
the  target  moving  laterally,  whereas  during  periods  when  he  made  large- 
amplitude  vergence  movements,  he  reported  seeing  the  target  moving  in 
depth. 
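
The interocular contrast manipulation can be written out as a small sketch. It assumes that the two luminances were moved symmetrically about the mean and that the ratio is taken as brighter over dimmer; both conventions, and the function names, are assumptions made for illustration, since the talk gives only the conceptual definition.

```python
def target_luminances(mean_fl, delta_fl):
    """Left and right target luminances (foot lamberts) when one target is
    raised and the other lowered by the same amount about the mean.

    The symmetric change is an assumption; the source says only that the
    two luminances were varied inversely.
    """
    if not 0.0 <= delta_fl < mean_fl:
        raise ValueError("delta must be non-negative and below the mean")
    return mean_fl + delta_fl, mean_fl - delta_fl

def interocular_contrast_ratio(lum_a, lum_b):
    """Ratio of the two monocular target luminances, brighter over dimmer
    (an assumed convention, so the ratio is always at least 1.0)."""
    brighter, dimmer = max(lum_a, lum_b), min(lum_a, lum_b)
    return brighter / dimmer

# Start from equal luminances at a 3.0 fL mean and step the difference up.
for delta in (0.0, 0.5, 1.0, 1.5, 2.0):
    left, right = target_luminances(3.0, delta)
    ratio = interocular_contrast_ratio(left, right)
    print(f"left={left:.1f} fL  right={right:.1f} fL  ratio={ratio:.2f}")
```

Under this convention a ratio of 1.0 is the starting condition of equal luminances, and larger ratios correspond to the right-hand side of the figures discussed in this section.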

Several  interesting  results  were  obtained  from  this  study.  The 
first was that for contrasts greater than 20 percent, the contrast of
the stimuli was not a primary factor in determining an observer's perception
of motion-in-depth. Target luminance in the range 1.5 to 3.0
foot  lamberts  was  also  not  a  primary  factor.  Three  factors  were  found 
that  substantially  influenced  the  observer's  perception  of  motion  in 
depth.  These  were  the  interocular  contrast  of  the  target,  the  ocular 
dominance  of  the  observer,  and  the  previous  experiences  of  the  observer 
with  three-dimensional  displays. 

FIGURE 15 PHYSIOLOGICAL AND SUBJECTIVE MEASURES OF OBSERVER RESPONSES

FIGURE 16 TRANSITION INTEROCULAR CONTRAST RATIOS OF TRAINED AND NAIVE
OBSERVERS FOR THE PERCEPTION OF MOTION IN DEPTH, AMBIGUOUS
MOTION, AND LATERAL MOTION OF A 1.5-fL STIMULUS
(HORIZONTAL AXIS: INTEROCULAR CONTRAST RATIO)

Figure 16 shows the interocular contrast ratios at which trained
and naive observers saw the stimulus target moving exclusively in
depth, ambiguously, or laterally when the stimulus target had a mean
luminance of 1.5 foot lamberts. The data points in this and the
following figure are the mean interocular contrast ratios at which, for
a given contrast of the stimulus, an observer's perception of object
motion changed, for example from motion-in-depth to ambiguous motion.
The connecting lines are the limits of one type of motion perception,
for example ambiguous motion. Notice on the right-hand side of the
figure that at very high interocular contrast ratios all observers saw
the target moving only laterally. Moving leftward, we find a region of
interocular contrast ratios within which naive observers saw the target
moving laterally but trained observers continued to see the target
moving with some motion-in-depth. Moving further to the left is a
region of interocular contrast ratios at which all observers saw the
targets moving ambiguously. Another step to the left and we find a
region where naive observers continued to see the target moving ambiguously
but trained observers saw the target moving only in depth.
Finally we come to the lowest interocular contrast ratios wherein all
observers saw the targets moving in depth.
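
The data points described above are means of transition ratios. A minimal sketch of that bookkeeping follows. It assumes the textbook method-of-limits convention of placing each transition midway between the last step reported one way and the first step reported another; that convention is an assumption, not something stated in this talk, and the response series are invented purely for illustration.

```python
from statistics import mean

def transition_point(series, label):
    """Midpoint between the last step reported as `label` and the next step.

    `series` is a list of (interocular_ratio, response) pairs ordered by
    increasing ratio. Returns None if the response never leaves `label`.
    """
    for (ratio_0, resp_0), (ratio_1, resp_1) in zip(series, series[1:]):
        if resp_0 == label and resp_1 != label:
            return (ratio_0 + ratio_1) / 2.0
    return None

# Two hypothetical ascending series for one observer (not real data).
runs = [
    [(1.0, "depth"), (1.5, "depth"), (2.0, "ambiguous"), (2.5, "lateral")],
    [(1.0, "depth"), (1.5, "depth"), (2.0, "depth"), (2.5, "ambiguous")],
]

# One transition per series, then the mean across series.
depth_limits = [transition_point(run, "depth") for run in runs]
depth_to_ambiguous = mean(depth_limits)
print(depth_limits, depth_to_ambiguous)
```

Each plotted point would then correspond to one such mean for one transition (depth to ambiguous, or ambiguous to lateral) at one stimulus contrast.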

Figure 17 shows the effect of ocular dominance on the perception of
motion-in-depth. Starting from the right side once again, there is a
region of interocular contrast ratios at which a right-eye-dominant
observer saw only lateral motion, irrespective of whether the luminance
of the left or the right target was greater. Moving to the left, we find
a region of interocular contrast ratios at which this right-eye-dominant
observer continued to see the target moving in depth when the luminance
was reduced in his right (that is, his dominant) eye, but saw only
lateral motion when the luminance was reduced in his left eye. Moving
to the left once again, there is a region of interocular contrast ratios
where the observer saw the target moving ambiguously irrespective of
whether his left eye or his right eye viewed the target of greater
luminance. The next region to the left is a region in which the observer
saw only motion-in-depth when luminance was reduced in his dominant eye
but ambiguous motion when the luminance was reduced in his nondominant
eye. Finally, at the lowest interocular contrast ratios is a region
where an observer saw the target moving only in depth irrespective of
which eye saw the higher-luminance target.

FIGURE 17 TRANSITION INTEROCULAR CONTRAST RATIOS OF THE DOMINANT AND
NONDOMINANT EYES OF A RIGHT-EYE-DOMINANT OBSERVER FOR THE
PERCEPTION OF MOTION IN DEPTH, AMBIGUOUS MOTION, AND LATERAL
MOTION OF A 1.5-fL STIMULUS

It  is  important  to  note  that  under  conditions  where  observers 
first  saw  the  target  moving  only  laterally,  this  was  not  because  the 
lower-luminance target was too dim to be perceived. Most often, the
observer  could  see  both  targets,  but  they  both  appeared  to  be  moving 
laterally.  This  may  indicate  that  the  form-perception  threshold — that 
is,  the  threshold  for  the  perception  of  a  target  edge — is  lower  than 
the  depth  perception  threshold  for  that  same  target  edge. 

The  observation  that  ocular  dominance  also  appears  to  affect  the 
threshold for the perception of motion-in-depth is particularly interesting.
We usually assume that ocular dominance is a phenomenon that
affects  only  form  perception.  However,  these  data  indicate  that  the 
sensitivity  of  both  the  form-perception  mechanism  and  the  depth- 
perception  mechanism  may  be  affected  by  ocular  dominance. 

Finally, it is interesting to note that familiarity with three-dimensional
display systems appears to alter the threshold for the perception
of depth, but not for the perception of form. One reason may
be  that  depth  cues  presented  in  three-dimensional  displays  are  different 
from  those  normally  used  by  the  stereopsis  mechanism.  This  mechanism 
may  require  some  retraining  to  use  these  unusual  depth  cues  adequately. 

To  summarize  the  results  I  have  presented  here,  I  should  like  to 
return  to  the  analogy  I  drew  at  the  beginning  of  this  discussion.  The 
color  and  motion  of  images  on  two-dimensional  displays  are  perceptions 
generated  by  unnatural  inputs  to  the  visual  system.  By  understanding 
the  processes  by  which  the  visual  system  arrives  at  the  perception  of 
motion  and  color,  we  have  been  able  to  reproduce  these  percepts  by 
artificial  means. 

We  do  not  yet  have  adequate  information  about  how  the  visual  system 
synthesizes  the  third  dimension  to  allow  us  artificially  to  manipulate 
the input to the stereopsis mechanism. Our studies strongly suggest
that form perception and depth perception have different image requirements.
In our ongoing research program we are studying means by which
we  may  exploit  the  different  requirements  for  form  perception  and  depth 
perception  to  efficiently  produce  a  realistic  impression  of  both  form 
and  depth  on  3-D  displays. 


ABSTRACT

Graphic  displays  can  provide  accurate  representations  of  three-dimensional  space 
only  if  they  are  viewed  from  the  geometric  center  of  projection.  Other  viewing 
conditions  result  in  distortions  of  virtual  space.  A  current  paradox  of  graphic  display 
perception  is  that  such  distortions  are  not  always  evident  in  perception  of  depicted 
space. 

This  paper  presents  an  analysis  of  the  geometric  basis  for  distortions  of  the 
virtual  space  depicted  in  pictorial  displays.  Recent  experiments  are  summarized 
which  define  the  conditions  under  which  geometric  distortions  affect  perceived 
space.  Under  some  conditions,  an  active  perceptual  compensation  process  exists 
which  discounts  the  compression  and  expansion  of  virtual  space.  In  addition, 
regularity or familiarity of the viewed object greatly reduces the sensitivity to
distortion of spatial information.

Introduction

The work that I will discuss today is directed toward a fundamental issue in the study of
visual perception, and in the application of perceptual studies to the design of graphic displays.
Specifically, what is the relationship between visual stimulation or visual information, on the
one hand, and perceptual experience on the other? This is a fundamental question that one
would have hoped could have been settled long ago, but this is not the case. In the area of
space perception, for example, there is little agreement regarding the extent to which the
characteristics of the visual array projected to the eye determine the nature of perceived
experience.

When one considers the perception of space represented in pictures, these issues are
relevant to both a theoretical psychological and an applied engineering perspective. From the
standpoint of perceptual theory, the basic nature of picture perception has been ambiguous.
Originally, Gibson (1951) and many of his colleagues interpreted the phenomena of picture
perception as evidence for a direct theory of perception. Individuals were able to make accurate
judgments  of  depth  represented  in  pictures;  and  there  was  a  suggestion  that  under  the  right 
conditions,  observers  were  unaware  that  they  had  been  viewing  pictures.  The  interpretation  for 
such  results  was  that  the  array  projected  to  the  eye  from  the  picture  was  identical  to  the  array 
from  the  real  world.  Geometrically,  the  information  was  the  same  in  the  two  cases.  Therefore, 
the  same  processes  which  were  involved  in  the  pick-up  of  information  from  the  real  world 
could  be  used  to  pick  up  the  information  projected  from  a  photograph.  Pictures  acted  as 
informational  surrogates  for  actual  spatial  layouts.  Considerable  evidence  was  accumulated 
regarding  the  equivalence  of  pictures  and  real  scenes,  and  this  surrogate  theory  of  picture 
perception  was  perhaps  the  most  influential  over  the  last  two  decades. 

There  are  substantial  problems  with  such  a  view  that  are  fairly  easy  to  point  out.  There  is 
a  geometric  isomorphism  between  the  pictorial  and  environmental  arrays  only  when  a  picture  is 
viewed  from  the  geometrically  correct  center  of  projection.  When  a  picture  is  viewed  from 
some  other  place,  the  geometric  relations  are  changed;  the  space  specified  by  the  picture  is 
distorted  in  the  sense  that  it  does  not  correspond  to  the  actual  scene  that  was  depicted.  Now,  if 
space  perception  in  pictures  were  simply  and  directly  based  on  the  information  projected  to  the 
eye,  such  distortions  should  be  evident  in  perceived  space.  Our  impressions  and  judgments  of 
space should be similarly distorted. But this does not occur. Pictured space does not seem to
distort  when  we  walk  past  a  picture;  we  are  usually  unaware  of  the  distortions  present  in  studio 
photography;  and  artists  and  photographers  have  long  known  that  it  is  often  necessary  to  distort 
perspective  to  make  a  scene  "look  right". 

In  response  to  such  difficulties  with  the  surrogate  theory,  Gibson  (1979)  later  argued  that 
picture  perception  was  very  different  from  normal  space  perception  in  that  it  was  indirect  and 
mediated  by  some  interpretive  mechanism.  Hagen  (1974)  proposed  that  picture  perception 
involved  an  entirely  different  "mode"  of  perception,  although  the  nature  of  this  mode  was  not 
specified.  Others  such  as  Pirenne  (1970)  suggested  that  there  was  a  compensation  process 
which,  in  some  way,  was  able  to  discount  the  effects  of  geometric  distortions  on  perception. 


2. Current Address: Bell Laboratories; Lincroft, N.J. 07738.

From an applied perspective, non-visual processes in the perception of space
can play an important role in graphics design. There has been increased use of two-dimensional
displays of three-dimensional space in such areas as simulation, master-slave robotics, remote
piloting of vehicles, and in multi-variable integrated displays. In each of these applications it is
necessary that an operator respond to perceived space from a two-dimensional display.
Geometric  accuracy  (although  not  necessarily  realism)  has  been  an  important  aspect  of  display 
design.  The  non-visual  factors  that  affect  the  way  that  spatial  information  is  used  would  be 
important  variables  in  design  of  spatial  displays. 

The general questions that have been at the focus of the research that I will discuss
concern the determination of spatial perception by the geometry of the visual array, and the
nature of non-visual compensation processes that affect perception of space based on graphic
displays, that is, processes which can discount the effects of projective distortions of the visual
array. I will simply assert here that there is no optical information available from a picture or
graphic display for the presence, absence, or extent of any projective distortion. Ideally, a
compensation phenomenon, were it to exist, would operate primarily when distortions existed;
but if no optical information for distortion is present, how is the presence of a distortion
detected?

Early evidence for a spatial compensation process is rather sparse, and many, including
myself, doubted its existence. One investigator (Perkins, 1973) showed that shape distortions
were not perceived until the projective distortion was quite extreme; yet these data might not
indicate a perceptual compensation as much as a failure of discrimination of shape categories.
A second investigator (Hagen, 1974) found no perceptual effect of distortions on relative depth,
but information for relative depth is not affected by such distortions. Occasionally the
magnitude of the geometric distortion has been miscalculated, so conclusions about
compensation were moot. Finally, many arguments, and the data used to support them, have
been intuitive and phenomenological. One's intuition or awareness is not relevant here since
the empirical question is whether perception is in greater correspondence with the distorted
projection or with the environment that the picture is supposed to represent.

Preliminary  studies  that  were  conducted  in  my  lab  (Rosinski,  Mulholland,  Degelman,  & 
Farber,  1980),  however,  provided  evidence  for  some  form  of  pictorial  compensation.  In  a  task 
requiring  judgments  of  surface  orientation  represented  in  pictures,  one  arrangement  showed  a 
close  correspondence  between  perceived  slant  and  the  distorted  projection,  a  second  showed  no 
effects  at  all  of  the  projective  distortion.  This  particular  pattern  of  results  could  only  be 
reconciled  in  terms  of  some  compensation  mechanism. 

An  initial  issue  was  to  assess  the  degree  to  which  perceived  space  corresponded  to 
distorted  space.  To  accomplish  this,  Farber  and  I  (Farber  and  Rosinski,  1978;  Rosinski  and 
Farber, 1980) developed a geometrical analysis that could be used to quantitatively determine
the effects of projective distortions on depicted space. We reanalyzed a number of early studies
to determine the extent of the effects of distortion. Based on these findings, a research
program was initiated under the sponsorship of the Office of Naval Research to specifically test
the  correspondence  between  perceived  and  geometrically  specified  space. 

The essential nature of this analysis can be seen in Figure 1. This drawing represents a
square-tiled surface lying at an angle to another square-tiled surface. The same perspectival
rules used to create such a drawing can be used to analyze distortion. For either a real scene
viewed directly, or for a picture viewed from the geometrically correct center of projection, a
number of geometric relations obtain. For any surface, a line from the eye to the primary
vanishing point has the same orientation as the slant of the surface. The angle between the
line from the eye to the primary vanishing point of one surface, and the line from the eye to
the primary vanishing point of the second surface corresponds to the angle between the two
surfaces. The angle at the eye between the two vanishing points for the tiles' diagonals should
be 90 degrees.
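
The 90-degree relation for the tile diagonals can be checked numerically. The sketch below is an illustration under stated assumptions rather than the analysis we actually used: the eye sits on the picture's central axis, the tiled surface is tilted about a horizontal axis by a hypothetical 60 degrees, and the 112 cm center of projection is borrowed from the experiments described later. All function names are invented for the example.

```python
import math

def vanishing_point(direction, cop_distance):
    """Picture-plane coordinates of the vanishing point for lines running
    along `direction` (x, y, z), with the picture plane at z = cop_distance
    and the center of projection at the origin."""
    dx, dy, dz = direction
    return (cop_distance * dx / dz, cop_distance * dy / dz)

def angle_at_eye(vp_a, vp_b, eye_distance):
    """Angle in degrees, at an eye on the picture's central axis, between
    the lines of sight to two vanishing points."""
    a = (vp_a[0], vp_a[1], eye_distance)
    b = (vp_b[0], vp_b[1], eye_distance)
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return math.degrees(math.acos(dot / (norm_a * norm_b)))

# Tile edges run along (1, 0, 0) and (0, sin s, cos s); the diagonals of a
# square tile are their sum and difference, which are perpendicular.
slant = math.radians(60.0)  # hypothetical surface tilt
diagonal_1 = (1.0, math.sin(slant), math.cos(slant))
diagonal_2 = (-1.0, math.sin(slant), math.cos(slant))

COP_CM = 112.0  # assumed center of projection distance
vp_1 = vanishing_point(diagonal_1, COP_CM)
vp_2 = vanishing_point(diagonal_2, COP_CM)

angle_correct = angle_at_eye(vp_1, vp_2, COP_CM)          # eye at the center of projection
angle_too_close = angle_at_eye(vp_1, vp_2, COP_CM / 2.0)  # eye at half the correct distance
print(angle_correct, angle_too_close)
```

Viewed from the center of projection the angle comes out at 90 degrees to within rounding; from half the correct distance it exceeds 90 degrees, so the same picture no longer specifies square tiles.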

Figure 1. Geometry of Surface Layout.


It  follows  from  this  sort  of  analysis  that  if  the  eye  is  positioned  at  the  correct  center  of 
projection,  the  visual  array  from  the  display  specifies  the  location  of  objects  and  surfaces  in  the 
world.  That  is,  when  the  eye  is  at  the  center  of  projection,  the  environmental  and  pictorial 
arrays are identical, and the displayed space corresponds to the real scene. This is the simple
geometry  that  is  the  basis  for  linear  perspective  in  drawings  and  in  computer  graphic 
representations  of  three-dimensional  space. 

How can we characterize the distortions of space that result when the viewing point is
changed? We adopt a simple convention. For any new viewing point, we could describe the
new virtual space which would have generated the new array. A comparison of the new virtual
space with the original virtual space gives a quantitative index of the distortion.
Magnification is obtained if the viewing point is closer to the display than is the center of
projection; minification is obtained if it is farther away. Magnification implies a compression
of internal depth, with slanted surfaces becoming more frontal. We represent magnification and
minification as the ratio of correct to actual viewing distances. Thus, if one views from one-half
the correct distance the magnification ratio is 2.0; if one views from twice the correct distance,
the magnification ratio is 0.5. The changes in internal depth of objects in the virtual space
correspond to the reciprocal of the magnification ratio. Similar descriptions of virtual space
can be generated for lateral displacements of the viewing point. Lateral displacements of the
viewing point result in an additive combination of shear and magnification. The point to be
stressed here is that these distortions are not due to any particular viewing point, but rather
to the relation between the actual and the correct viewing point.
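
The magnification convention reduces to two lines of arithmetic, sketched here with hypothetical function names. The 112 cm correct distance is used only as a convenient example value.

```python
def magnification_ratio(correct_distance_cm, actual_distance_cm):
    """Ratio of correct to actual viewing distance, as defined in the text."""
    return correct_distance_cm / actual_distance_cm

def virtual_depth_scale(magnification):
    """Factor by which internal depth in the virtual space is scaled:
    the reciprocal of the magnification ratio."""
    return 1.0 / magnification

# Viewing from one-half, exactly, and twice the correct distance.
for actual_cm in (56.0, 112.0, 224.0):
    m = magnification_ratio(112.0, actual_cm)
    print(f"viewed from {actual_cm:5.1f} cm: magnification {m:.2f}, "
          f"depth scaled by {virtual_depth_scale(m):.2f}")
```

Viewing from half the correct distance therefore gives a ratio of 2.0 and halves the virtual depth, while viewing from twice the distance gives 0.5 and doubles it.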
Since we can define the real space, can calculate the virtual space, and can record
judgments indicating perceived space, the psychological question becomes quite simple. When
does  the  perception  of  space  in  graphic  displays  correspond  to  the  geometrically  specified  space? 
Does  compensation  for  distortion  occur?  Psychophysically,  these  become  relatively  easy 
questions  to  answer. 

Before  reviewing  some  of  our  results,  let  us  consider  how  such  a  compensation 
mechanism  might  operate.  As  I  asserted  earlier,  there  is  no  optical  information  for  distortion, 
and  the  nature  or  extent  of  any  distortion  is  not  given  in  the  display.  On  what  might  a  pictorial 
compensation  be  based?  One  alternative  is  that  one  recognizes  the  objects  depicted,  and  the 
pattern  match  criteria  are  extremely  broad.  Thus  one  might  recognize  horizontal  surfaces  or 
right  angles  even  if  the  geometry  of  the  projection  did  not  correspond  to  these  spatial  details. 
A  second  alternative  is  a  much  more  active  compensation  process.  What  we  have  proposed, 
and  what  our  results  indicate  is  happening,  is  that  the  discrepancy  between  an  actual  viewing 
point  and  an  assumed  correct  viewing  point  is  evaluated,  and  is  used  to  discount  the  effects  of 
the  geometric  distortion  caused  by  the  dislocation  of  the  viewing  point. 

I  will  review  the  results  of  a  series  of  studies  which  support  this  proposal.  This  review  is 
selected  from  several  studies  in  which  we  have  examined  all  possible  distortions  of  displays  of 
static objects and spatial layout, and their effects on perceived slant, depth, internal depth,
height, and width. In addition, we have explored the effects of geometric distortions of these
dimensions  of  space  on  moving  objects  and  layouts,  and  in  all  cases  a  single  pattern  of  results 
emerges. 

Distortions  of  Unfamiliar  Objects 

One set of studies has dealt with the effects of geometrical distortions on perceived
depth of unfamiliar objects. Magnification or minification induced by viewing a display from
too close or too far away (relative to the correct center of projection) causes a compression or
expansion of virtual space. We asked people to make magnitude estimates of the internal depth
of objects depicted on a CRT screen. The procedure was to project concentric
irregular five-sided shapes. The corresponding vertices of the shapes were connected by lines
to increase linear perspective information. The overall impression was of looking into an
irregularly shaped tunnel which receded into the distance. The participants were asked to judge
the objects' internal depth. The objects were computer drawn, and displayed on a CRT screen
which the observers viewed while in a chin rest to assure appropriate viewing distances. In the
first experiment the viewing point for all conditions was constant at 112 cm, while the center of
projection was varied to result in a range of distortions of virtual space equivalent to
magnifications of 0.25 to 3.0.

If  perception  of  the  displayed  space  were  determined  by  the  projection,  we  should  expect 
a correspondence between perceived space and the distorted virtual space specified by the
display.  In  fact,  as  can  be  seen  in  Table  1,  there  was  an  extremely  close  correspondence 
between  the  actual  judgments  and  those  expected  on  the  basis  of  the  geometric  distortion.  In 
general,  internal  depth  was  accurately  perceived  when  the  CRT  screen  was  viewed  from  the 
correct  center  of  projection. 

Table 1

Power Functions for Magnification
Viewing Distance Constant

Magnification    Coefficient    Exponent
0.25             4.67           0.58
0.50             1.86           0.69
1.00             1.32           0.72
2.00             0.60           0.73
3.00             0.61           0.70

A 4X minification resulted in an expansion of perceived space by a factor of approximately 4.
Similarly, magnifications resulted in compressions of perceived space as expected from the
induced distortions of geometric information.
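
One way to read Table 1 against the geometric prediction is sketched below. It takes each coefficient relative to the coefficient at a magnification of 1.00 and sets it beside the reciprocal of the magnification. Normalizing to the 1.00 row, and treating the coefficient as an index of the scale of perceived depth, are assumptions made for this illustration; they are not a description of how the power functions were analyzed.

```python
# Power-function coefficients from Table 1, keyed by magnification.
TABLE_1_COEFFICIENTS = {0.25: 4.67, 0.50: 1.86, 1.00: 1.32, 2.00: 0.60, 3.00: 0.61}

def relative_coefficients(table, reference=1.00):
    """Each coefficient divided by the coefficient at the reference
    magnification (an assumed normalization)."""
    ref = table[reference]
    return {m: c / ref for m, c in table.items()}

observed = relative_coefficients(TABLE_1_COEFFICIENTS)

for m in sorted(TABLE_1_COEFFICIENTS):
    geometric = 1.0 / m  # depth scale expected from the distortion alone
    print(f"magnification {m:4.2f}: observed {observed[m]:.2f}, "
          f"geometric {geometric:.2f}")
```

On this reading the 0.25 condition gives a relative coefficient of about 3.5 against a geometric value of 4, and the 2.00 condition about 0.45 against 0.5.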

It is clear from these results that there is a close relationship between the perception of
internal  depth  represented  in  graphic  displays,  and  the  nature  of  the  geometric  information 
provided  by  the  display.  Inducing  distortions  in  the  display  projections  results  in  regular  and 
predictable  errors  in  perception.  If  distortions  are  introduced  by  projecting  the  display  to  a 
point  other  than  the  normal  viewing  point,  corresponding  distortions  in  perception  result. 
Appropriate  choice  of  a  center  of  projection  in  designing  graphic  displays  is  crucial  for 
perceptual  judgments,  at  least  under  certain  circumstances. 

It  is  to  be  expected  that  there  would  be  a  close  relationship  between  judged  depth  and 
distortion. Since there is no optical information for distortion, judgments correspond to that
specified  by  available  information.  The  projective  distortions  of  magnification  and  minification 
can  be  generated  in  two  ways:  moving  the  center  of  projection  while  maintaining  a  constant 
viewing  position  as  was  done  above,  and  by  moving  the  viewing  point  while  maintaining  a 
constant  location  for  the  center  of  projection.  In  this  latter  case,  the  degree  of  magnification 
(and  of  the  expansion  or  contraction  of  perceived  space)  is  perfectly  related  to  viewing  distance. 
Under  such  conditions,  a  non-optical  basis  for  compensation  exists,  and  individuals  could,  in 
principle,  discount  the  effects  of  variation  in  viewing  point. 

To determine whether such discounting of distortion occurs within the context of the
perception of unfamiliar objects, magnifications ranging from 0.33 to 4.0 were created by
projecting the display to a point 112 cm away from the screen while the display was viewed
from points between 28 cm and 337 cm away. Since magnifications are related to the ratio of
correct to actual viewing distance, these viewing conditions result in projective distortions
equivalent to those used in the preceding experiment. If only the projection affects judgment,
equivalent distortions of perceived depth are expected in this case.

Subjects' judgments, however, showed no effect of the geometric distortions in this case.
As shown in Table 2, in spite of a twelve-fold distortion
of virtual space induced by the geometric distortion, there is no effect demonstrated in
perceived depth; power function coefficients are essentially constant. These data conclusively demonstrate
that compensation for the distorting effects of magnification occurs when the distortions are
caused by moving the viewing point, but not when equivalent distortions are caused by moving
the center of projection. Since the distortions are discounted only when the distortions are
correlated with viewing distance, we have suggested that a comparison between the actual
distance and some assumed correct or standard distance forms the basis for compensation.

Table 2

Power Functions for Magnification
Viewing Distance Varied

Magnification    Coefficient    Exponent
0.33             1.02           0.77
0.50             1.12           0.74
1.00             1.09           0.76
2.00             1.04           0.78
4.00             1.17           0.72
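
The twelve-fold figure and the constancy of the Table 2 coefficients can be verified with a few lines of arithmetic. This is a sketch using the distances and coefficients reported above; the variable names are invented, and comparing the largest with the smallest coefficient is simply one convenient summary, not the analysis we performed.

```python
CORRECT_DISTANCE_CM = 112.0  # center of projection used in this experiment

def magnification(actual_distance_cm):
    """Ratio of correct to actual viewing distance."""
    return CORRECT_DISTANCE_CM / actual_distance_cm

nearest_cm, farthest_cm = 28.0, 337.0
mag_near = magnification(nearest_cm)   # largest magnification
mag_far = magnification(farthest_cm)   # smallest magnification
distortion_range = mag_near / mag_far  # overall range of distortion

# Power-function coefficients from Table 2, keyed by magnification.
TABLE_2_COEFFICIENTS = {0.33: 1.02, 0.50: 1.12, 1.00: 1.09, 2.00: 1.04, 4.00: 1.17}
coefficient_spread = (max(TABLE_2_COEFFICIENTS.values())
                      / min(TABLE_2_COEFFICIENTS.values()))

print(f"magnifications {mag_far:.2f} to {mag_near:.2f}, "
      f"range {distortion_range:.1f}x")
print(f"largest/smallest coefficient: {coefficient_spread:.2f}")
```

The geometric distortion spans a factor of about 12, while the largest and smallest coefficients differ by a factor of less than 1.2.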

Effects  of  Familiarity 

It is clear that individuals can actively discount the distorting effects of projective
transformations of displayed objects. The commonly reported inability of individuals to notice
such distortion seems to be due to some additional factor. Under some conditions people do
not appear to notice that a distortion is present. We distinguish this from a more active
compensation because some failure to discriminate or loss of sensitivity seems to occur.

Perceptual judgments of spatial layouts can involve two different activities. One is the
registration and processing of projective geometric information. A second may simply involve a
perceptual categorization of an object. If something is categorized as a cube, judgments of its
relative dimensions may be influenced by assumptions concerning known qualities of the object.

To  explore  such  an  effect,  further  experiments  were  conducted  that  were  analogous  to  the 
ones  discussed  above.  A  series  of  rectangular  solids  with  equal  length  and  width  were  created. 
The stimulus objects were subjected to two Euler transforms so that the two sides were at a 45
degree  angle  to  the  screen,  and  the  top  was  at  10  degrees  relative  to  the  screen.  Such  an 
arrangement  gives  good  3-point  perspective.  In  one  experiment,  the  subjects  viewed  the  screen 
from  a  distance  of  112  cm.  while  the  objects  were  displayed  with  centers  of  projection  ranging 
from  28  cm  to  450  cm.  These  relations  give  magnifications  which  result  in  distortions  of  virtual 
space  of  from  0.25  to  4.0.  The  observation  conditions  were  identical  to  those  described  in  the 
first  experiment  above  which  resulted  in  large  distortions  of  judgment. 
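The relation between distances and magnification used here can be checked with a few lines of arithmetic. This is a minimal sketch that takes magnification to be the ratio of the actual viewing distance to the distance of the center of projection, as stated earlier in the text; the function name is hypothetical, and the matched 112 cm case is included only for illustration.

```python
def magnification(viewing_distance_cm, projection_distance_cm):
    """Ratio of actual viewing distance to the correct (projected) distance."""
    return viewing_distance_cm / projection_distance_cm

VIEWING_DISTANCE_CM = 112.0

# The 28 cm and 450 cm endpoints come from the text; 112 cm is the matched
# case in which the observer sits at the center of projection.
for center_cm in (28.0, 112.0, 450.0):
    m = magnification(VIEWING_DISTANCE_CM, center_cm)
    print(f"center of projection {center_cm:5.0f} cm -> magnification {m:.2f}")
```

The two endpoints reproduce the stated range of about 0.25 to 4.0.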

In contrast, judgments of the internal depth of the regular parallelepipeds showed little
effect of the distortion of virtual space. Although there are visible, significant effects of the
distortion of virtual space, their size was an order of magnitude less than expected
from the distortion. Thus it appears that the perceptual effects of an expansion or compression
of virtual space are severely restricted when a familiar, regular target object is used.

In a further experiment using the rectangular solids, the displays were projected to a
constant distance of 112 cm from the screen, but the displays were viewed from various distances
that resulted in expansion or contraction of virtual space by factors ranging from 0.25 to 4.0. In
this study the degree of distortion was directly related to the distance from the subject to the
display screen. The range of the effect of the geometric distortions is reduced relative to the
preceding  experiment,  and  statistically,  the  perceptual  effects  of  distortions  of  virtual  space  are 
reduced  when  the  degree  of  distortion  is  caused  by  moving  the  observer’s  viewing  point. 
However,  the  absolute  magnitude  of  this  compensation  is  extremely  small.  The  familiarity  or 
regularity  of  the  objects  renders  the  perceptual  system  quite  insensitive  to  projective  distortions. 

Insensitivity  To  Distortions 

The  results  of  the  preceding  experiments  show  that  for  regular  objects,  it  is  virtually 
impossible  for  observers  to  detect  projective  distortions  of  their  virtual  dimensions.  The  extent 
of  this  insensitivity  is  revealed  by  a  series  of  signal  detection  experiments  undertaken  to  assess 
the sensitivity to geometric distortion. The method used was a modified staircase scaling
procedure. The rectangular solids described above, under 10 different degrees of distortion, were
projected on a CRT screen. Subjects were asked simply to indicate whether the depicted object
appeared distorted (under various criteria). If the subject responded no, the experimental
program increased the degree of projective distortion. If the subject responded yes (they saw
some distortion) on two successive trials, the amount of distortion was decreased. This
procedure  effectively  tracks  the  d’  =  0.707  point.  The  intent  was  to  compare  different 
distortions  that  corresponded  to  a  constant  value  of  d’. 
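The up-down rule described above can be sketched as follows. This is an illustrative reconstruction, not the experimental program: the level indices 0 to 9 (standing for the 10 degrees of distortion), the step size, and the function names are assumptions. In the standard analysis of such rules, the track settles at the level where two successive "yes" responses are as likely as not, that is, where the probability of a single "yes" is about 0.707.

```python
def staircase_update(level, saw_distortion, yes_streak,
                     step=1, min_level=0, max_level=9):
    """Apply one trial of the rule and return (new_level, new_yes_streak).

    A "no" response raises the distortion level; two successive "yes"
    responses lower it. Level indices are hypothetical.
    """
    if not saw_distortion:
        return min(level + step, max_level), 0
    yes_streak += 1
    if yes_streak == 2:
        return max(level - step, min_level), 0
    return level, yes_streak

def run_staircase(responses, start_level=0):
    """Return the sequence of levels visited for a list of yes/no responses."""
    level, streak = start_level, 0
    track = [level]
    for saw_distortion in responses:
        level, streak = staircase_update(level, saw_distortion, streak)
        track.append(level)
    return track
```

For example, the response sequence no, no, yes, yes, yes, no moves the level up twice, holds it after the first "yes", lowers it after the second, holds again, and then raises it.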

In three initial experiments, using different definitions of distortion, it was impossible to
obtain any measure of d’. Magnifications resulting in a thirty-fold compression of virtual space
were not reported as distorting the objects.

To  simplify  the  task,  the  procedure  was  changed  to  a  two-alternative  forced  choice 
paradigm,  and  only  one  object  (a  cube)  was  used  in  place  of  the  series  of  rectangular  solids. 
Pairs of cubes were presented successively. One was undistorted (i.e., was projected to the
viewing point); the other was distorted to the extent determined by the staircase procedure.
Using this procedure it was possible to make a crude estimate of sensitivity. The average value
of distortion which corresponded to a d’ of 0.707 was a magnification of 2.8 for
compression of space, and a magnification of 0.33 for expansion. Thus, virtual space had to be
compressed or expanded by a factor of about three in order for observers to discriminate a shape
distortion at this low level of sensitivity. In addition, there was a great deal of intra-subject
variability. There appears to be no fixed separation of the underlying signal and noise
distributions; rather, sensitivity changes greatly from trial to trial. The processes that are
involved  in  recognition  of  regular  objects  appear  to  greatly  interfere  with  the  ability  to  judge 
displayed  space  simply  on  the  basis  of  projected  information. 

Implications 

The theoretical conclusions to be drawn from this work seem to be clear-cut. With
irregular  or  unfamiliar  targets,  and  novel  visual  display  systems,  the  geometric  projection  is  the 
major,  if  not  sole,  determiner  of  space  perception  based  on  graphic  displays.  For  display 
applications  intended  for  unusual  environments,  work  must  concentrate  on  increasing  display 
fidelity.  Discovery  of  basic  processes  in  perception,  especially  in  terms  of  the  integration  of 
several different sources of visual information (e.g., binocular, monocular, motion-carried) is
critical.  In  addition,  I  would  like  to  see  the  growth  of  exploratory  studies.  We  need  to  relate 
the  kinds  of  results  that  I  have  reported  to  actual  control  activities.  A  pressing  question 
concerns  the  relationship  between  perception  based  on  graphic  displays,  and  remote  piloting  and 
video  maneuvering. 

With  familiar  display  systems,  our  results  suggest  that  geometric  distortions  can  be 
discounted  by  the  perceptual  system.  The  discrepancy  between  the  actual  viewing  point  for  a 
display  and  some  assumed  correct  viewing  point  is  used  to  eliminate  the  effects  of  distortion  in 
space  perception.  An  obvious,  but  important  question  concerns  the  nature  and  amount  of 
experience that maximizes this effect. How can we train display operators and users to
maintain perceptual accuracy in spite of geometric distortions?

For  regular,  familiar  target  objects,  the  categorization  of  these  objects  may  reduce  or 
eliminate  sensitivity  to  spatial  information.  This  raises  important  questions.  What  is  the 
interaction  between  training  and  sophistication,  and  the  ability  to  accurately  use  spatial 
information? Can we, for certain applications, degrade the fidelity of a display effectively? If
details of spatial information are unimportant in some instances, can we save display and
computing costs by using symbols rather than accurate graphic representations? In a related
vein, if sensitivity to distortion is low in some cases, can we more effectively use bandwidth by
updating displays only when the displays are perceptually different?

Future challenges lie in exploratory developments making use of, and further driving,
additional basic research.

ABSTRACT

This  report  presents  the  results  of  a  series  of  five  experiments  in  which 
we  sought  to  determine  the  effect  of  the  spatial  and  temporal  attributes  of  a 
dotted  stimulus  form  on  its  detectability  when  masked  by  random  dotted  visual 
noise.  The  stimulus  consisted  either  of  a  single  dot  that  was  repetitively 
flashed  or  of  a  straight  line  of  seven  dots.  The  number  of  noise  dots,  the 
interval  between  stimulus  dots,  dot  position,  trajectory  direction,  and 
regularity  of  the  spatial  and  temporal  intervals  were  examined  to  determine 
what,  if  any,  influence  these  stimulus  properties  exerted  on  stimulus 
detectability.  Among  the  more  surprising  results  of  our  study  was  the 
discovery  of  a  remarkable  insensitivity  to  the  temporal  irregularity  of  either  a 
flashing  single  dot  or  a  sequentially  plotted  line  of  dots.  In  addition,  an 
equivalent  insensitivity  to  spatial  position  irregularity  in  a  line  of  dots  was 
also  discovered  for  the  conditions  used  in  this  experiment.  The  significance  of 
these  findings  to  our  understanding  of  visual  image  processing  is  discussed  as 
well as possible applications of the methods and findings emerging from these
experiments. Some suggestions for future research are also presented.

INTRODUCTION

The  explanation,  analysis,  and  understanding  of  visual  form  perception 
has  been  a  major  goal  of  experimental  psychology  throughout  its  history.  This 
problem  area  is  of  central  concern  for  the  simple  reason  that  our  relationship 
to  the  external  world  is  so  dependent  upon  our  ability  to  detect,  recognize  and 
classify  stimuli  as  well  as  to  choose  an  appropriate  response  to  them.  The 
history  of  the  problem  of  form  perception  contains  such  illustrious  names 
(sometimes  extremely  unexpectedly  and  sometimes  quite  expectedly)  as  Plato, 
Aristotle, Democritus, Alhazen, Seneca, Galen, Avicenna, Grosseteste,
Descartes, Da Vinci, Vesalius, Berkeley, Hobbes, Goethe, Muller, Helmholtz,
Mach, James, Koffka, Wertheimer, and Gibson. There is also a large company
of  other  historical  figures,  as  well  as  a  growing  army  of  our  contemporaries 
who  have  all  been  concerned  with  various  aspects  of  the  form  perception 
problem. 

In  spite  of  this  broad  and  long  history  of  interest  in  the  problem  it  is 
startling  to  realize  how  infrequently  form  perceptionists  of  the  past  or  present 
have  asked  what  is  perhaps  the  fundamental  question  in  studies  of  this  genre. 
That  question,  whose  neglect  a  number  of  our  contemporaries  (e.g.,  Zusne, 
1970;  Sutherland,  1967)  have  also  noted,  is  —  What  are  the  specific  attributes 
or  characteristics  of  a  form  that  regulate  its  detectability  or  recognizability? 
Since the heyday of the Gestalt tradition, only a few psychologists have
approached the study of visual form perception from this perspective (e.g.,
Rock, 1973; Brown and Owen, 1967), and then usually in a manner that
emphasized some simple transformation (e.g., orientation), some general
feature (e.g., compactness), or memory rather than the specific geometry of
the  stimulus  form  itself. 

We  believe  that  there  are  three  main  reasons  for  the  neglect  of  this 
fundamental  question.  First,  there  is  as  yet  no  adequate  means  of  quantifying 
what  exactly  we  mean  by  the  word  "form".  While  some  authors  have  suggested 
statistical  families  of  forms  that  are  alike  in  some  general  way,  there  is  not 
yet  any  satisfactory  single  dimension  along  which  form  may  be  continuously 
measured  comparable  to  electromagnetic  frequency  in  color  research  or 
acoustic  frequency  in  pitch  research.  Furthermore,  neither  the  algebra  of  form 
suggested  by  Leeuwenberg  (1969,  1971)  nor  the  statistical  algorithms  for 
generating  individual  samples  of  broad  classes  of  form  (Attneave  and  Arnoult, 
1956;  Fitts  and  Leonard,  1957)  have  yet  proved  to  be  a  satisfactory  and 
acceptable  means  of  quantification  of  form  as  an  experimental  variable. 
Forms,  therefore,  are  usually  generated  in  a  more  or  less  arbitrary  manner  and 
are  often  defined  as  experimental  stimuli  on  the  basis  of  some  vaguely 
articulated  ad  hoc  rule.  This  difficulty  remains;  our  group  has  done  no  better 
than  our  predecessors  in  resolving  this  problem.  As  reported  in  a  later  section, 
our  stimuli  are  also  more  or  less  arbitrary,  although  in  some  cases  a  natural 
measure  (e.g.,  variance)  does  satisfy  the  immediate  needs  of  a  particular 
experiment.

The second reason that the specific attribute problem has been ignored is
that  heretofore  there  has  been  no  easy  way  to  manipulate  even  arbitrarily 
defined  forms  in  stimulus  displays.  The  advent  of  the  laboratory  computer, 
however,  has  ameliorated  this  difficulty  and  forms  of  great  variety  and 
complexity  in  two,  three,  and  even  four  dimensions  (i.e.,  X,Y,Z,t)  are  easily 
generated  in  many  laboratories  about  the  world  today. 

The  third  reason  that  the  attribute  problem  has  been  neglected  is  that 
the  manipulation  of  the  form  of  continuous  figures  usually  leads  to  a 
confounded  outcome.  That  is,  changing  the  global  arrangement  of  the  form 
also  often  covaries  other  local  features  or  attributes  (e.g.,  number  of  elements 
in  a  continuous  line,  angles,  etc.)  in  a  way  that  makes  the  actual  causal 
relationship  between  a  particular  attribute  of  the  form  and  a  measure  of  the 
perceptual  response  uncertain.  The  use  of  dot  patterns  at  least  partially 
overcomes this problem. There are no local attributes other than
"arrangement" when one is dealing with dot patterns; as long as the number of
dots remains constant, all of the other aspects of the stimulus can be subsumed
under  the  single  factor.  On  the  other  hand,  "arrangement",  however  singular,  is 
not  itself  a  simple  term;  it  is  at  least  as  complicated  as  "form"  and  it  may 
involve  multidimensional  variation  when  a  single  attribute  is  changed. 
Nevertheless,  dot  patterns  can  be  manipulated  in  a  reasonably  straightforward 
manner. 

At  present  our  laboratory  is  carrying  out  a  program  of  research  aimed 
at elucidation of the factors influencing the detection of dotted forms in a
dynamic  stereoscopic  space.  Our  observers  perceive  what  appears  to  them  to 
be  a  three  dimensional  (cartesian)  volume  in  which  some  of  the  stimuli  may  be 
moving  or  flickering.  This  temporal  property  makes  our  experiments  four 
dimensional,  but  in  an  "Einsteinian"  rather  than  a  "hyperspace"  context.  That 
is,  our  "space"  is  one  defined  by  three  spatial  coordinates  and  one  temporal 
one,  and  not  four  spatial  ones.  Our  current  four  dimensional  studies  are  an 
outgrowth  of  studies  carried  out  earlier  on  analogous  detection  tasks  in  two 
dimensional  space  (as  summed  up  in  Uttal,  1975)  and  others  (Uttal,  Fitzgerald 
and  Eskin  1975,  A;  B)  in  which  we  examined  some  of  the  fundamental 
properties  of  stereoscopic  space  itself  using  random  dot  stereograms  in  the 
tradition  established  by  Julesz  (1960;  1971). 

One  of  the  most  important  aspects  of  both  the  earlier  and  the  present 
work is that we conceive of it as being quite limited in scope. That is, we are
not  studying  all  stages  of  form  perception  in  these  experiments.  As  will 
become  evident  when  we  discuss  our  experimental  paradigm,  our  concern  is 
only  with  what  is  a  putatively  "primitive"  stage  of  form  detection  and  the 
stimulus  attributes  that  affect  that  stage.  It  is  also  important  to  appreciate 
that  our  goal  is  to  study  form  perception  and  not  short  term  memory  or 
stereopsis  themselves.  Others  such  as  Hogben  (1972),  Di  Lollo  (1980),  and 
Jonides,  Irwin,  and  Yantis  (1982)  share  with  us  an  enthusiasm  for  the  dot  as  a 
research tool. However, our goal here is to use persistence, masking, and
binocular disparity as vehicles to explore the perception of form rather than
short term visual memory, the target of their studies. This is a goal we share
with Lappin and his colleagues (Lappin, Doner, and Kottas, 1980; Falzett and
Lappin, 1981) and Johansson (e.g., 1978), who also use dot patterns as a means
of studying form perception.

More  specifically,  the  long  term  goal  of  our  project  is  to  determine  how 
we  detect  dotted  forms  in  a  volume  and  to  infer  from  these  data,  how  we  see 
forms  in  general.  We  plan  to  achieve  this  goal  by  examining  the  detectability 
of  static  and  dynamic  single  dots,  dotted  lines,  dotted  surfaces,  and  dotted 
solids  in  a  variety  of  masking  environments.  In  particular,  we  are  attempting 
to  determine  what  aspects  of  the  spatial  and  temporal  arrangement  of  these 
dots  influence  their  detectability  when  they  are  embedded  in  camouflaging 
visual  noise  made  up  of  randomly  positioned  dots  or  dot  arrays.  Our  hope  is 
that  the  results  obtained  in  this  highly  abstract  stimulus  situation  will 
generalize  to  other  kinds  of  visual  stimuli,  and  that  what  we  learn  here  will 
tell  us  something  about  how  we  see  all  kinds  of  forms.  In  the  particular 
studies that are presented in this report we are specifically concerned with
determining the effects of the spatial and temporal characteristics of single
dots and dotted lines on their detection in noise that consists of briefly
presented, randomly placed, single dots. Our experimental paradigm is thus a
masking experiment; however, it is not intended to be a study of masking per
se. Masking in this case is but the vehicle we use to study form detection.

In  the  earlier  work  (Uttal,  1975),  the  two  dimensional  analog  of  the 
present  stereoscopic  experiments,  we  were  successful  at  proposing  a 
mathematical  model  based  on  the  autocorrelation  function  that  was  capable  of 
predicting  the  rank  order  detectability  of  sets  of  targets.  We  also  hope  to 
extend  that  model,  or  some  modification  of  it,  to  the  multidimensional  case 
embodied  in  the  dynamic  stereoscopic  stimulus  space  in  which  our  observers 
now  operate.  However,  these  initial  experiments  have  only  begun  to  provide 
the  information  required  for  such  modeling  and,  therefore,  this  report  will  not 
speak  to  that  theoretical  part  of  our  task. 

We  report  here  the  results  of  five  experiments  concerned  with  the 
detection  of  dots  and  dotted  lines.  In  Experiment  I,  we  consider  the 
detectability  of  evenly  spaced  dotted  lines  as  a  function  of  their  direction  and 
the magnitude of regular (equal) temporal intervals between the plotting of
the  sequence  of  stimulus  dots  in  successive  equally  spaced  locations.  This  is 
the  master  experiment  for  the  dotted  line  study.  It  provides  the  basic 
parametric  data  against  which  the  results  of  the  other  experiments  will  be 
referenced.  Experiment  II  explores  the  effects  on  detectability  of  introducing 
irregularity  into  the  temporal  intervals  between  the  successive  dots  of  a 
dotted  line.  Experiment  III  determines  the  effect  of  introducing  irregularity 
into  the  spatial  separations  between  the  successive  dots  of  similar  dotted 
lines. 
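To make the stimulus manipulations in Experiments I to III concrete, the sketch below builds a seven-dot line as a list of (x, y, z, t) tuples with equal spatial and temporal spacing, and then perturbs only the plotting times. The function names, units, and the uniform-jitter scheme are hypothetical; the report does not specify in this section how irregularity was generated.

```python
import random

def dotted_line(start, direction, spacing, interval_ms, n_dots=7):
    """Evenly spaced line of dots as (x, y, z, t) tuples.

    `direction` is assumed to be a unit vector; dot i is plotted at time
    i * interval_ms (the regular case of Experiment I).
    """
    x0, y0, z0 = start
    dx, dy, dz = direction
    return [(x0 + i * spacing * dx,
             y0 + i * spacing * dy,
             z0 + i * spacing * dz,
             i * interval_ms)
            for i in range(n_dots)]

def jitter_times(dots, max_jitter_ms, rng):
    """Temporal irregularity in the spirit of Experiment II (hypothetical scheme).

    Interior dots get a uniform time offset; positions, and the first and
    last plotting times, are left unchanged.
    """
    last = len(dots) - 1
    jittered = []
    for i, (x, y, z, t) in enumerate(dots):
        if 0 < i < last:
            t += rng.uniform(-max_jitter_ms, max_jitter_ms)
        jittered.append((x, y, z, t))
    return jittered
```

A spatial analog for Experiment III would perturb the position along the line instead of the time, keeping the intervals regular.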

Experiments  IV  and  V  deal  with  the  detectability  of  repetitively 
flashing  dots  positioned  at  a  single  point  in  space.  Experiment  IV,  like 
Experiment I, is the master experiment that provides the basic parametric data
for flashing dots. Experiment V, like Experiment II, investigates the effect of
irregular temporal intervals, but in this case for only a single dot. In both
Experiments I and IV we have also parametrically varied the density of the
masking noise to determine the effect of this very important variable and to
determine  if  there  are  any  discontinuities  in  the  detectability  functions  over 
the  range  of  noise  densities  used  in  these  experiments. 

METHOD 


Observers.  In  each  of  the  experiments  we  report  here,  at  least  three 
and  usually  four  undergraduate  students  at  the  University  of  Michigan  were 
used  as  paid  observers  for  at  least  one  academic  term.  Each  was  tested  for 
normal  stereoscopic  vision  with  an  anaglyphic  screening  procedure  (Figure 
8.1-2* from Julesz, 1971) and self-reported normal or corrected refraction.
One observer, however, was dissociated from the project after demonstrating
adequate stereopsis with anaglyphs, but failing to display discrimination in the
computer  controlled  task.  The  data  reported  here  are  from  several  groups  of 
observers  in  two  sets  of  experiments  carried  out  several  years  apart. 
Adequate  replication  of  all  of  the  older  work  has  been  carried  out  to  assure 
that  no  significant  difference  in  results  occurred  as  a  result  of  new 
procedures  or  equipment.  (In  the  present  report  we  describe  only  a  new 
version  of  the  instrumentation.)  All  observers  were  pretrained  with  unmasked 
versions  of  the  stimulus  forms  used  in  each  experiment  for  several  days  prior 
to  the  actual  data  collection  sessions. 

General  Procedure.  All  of  the  experiments  reported  here  were  carried 
out  using  a  two  alternative,  forced  choice,  dot-masking  paradigm  in  which  the 
percentage  of  the  total  number  of  trials  in  which  the  stimulus  forms  were 
correctly  detected  was  the  criterion  of  performance.  Stimulus  forms  to  be 
detected  were  constructed  of  one  or  seven  prearranged  dots.  These  stimulus 
forms  are  presented  hidden  in  varying  numbers  of  randomly  placed  masking 
dots  -  subsequently  referred  to  as  visual  noise.  The  organized  stimulus  forms 
(e.g.,  a  single  repetitively  flashing  dot  or  sequentially  presented  straight  line 
of  seven  dots)  are  thus  interspersed  both  temporally  and  spatially  among  the 
random  masking  dots.  The  masking  dots  are  always  presented  at  constant 
interdot  temporal  intervals;  the  temporal  and  spatial  regularity  of  the  stimulus 
dots  are  both  experimental  variables  in  the  present  study. 

Figure  1  shows  a  typical  dotted  line  stimulus  form  consisting  of  seven 
dots presented in four stereoscopic displays in successively higher levels of
visual  noise.  The  observer's  task  in  each  case  is  to  report  in  which  of  two 
sequential  stereoscopic  presentations  the  stimulus  form  is  located.  Each 
presentation  is  presented  as  a  dichoptic  pair  of  images  that,  when  perceptually 
fused,  creates  the  impression  of  a  cubical  volume  in  which  the  dots 
constituting  the  stimulus  forms  and  the  random  dotted  visual  noise  appear  at 
times  and  positions  depending  upon  the  design  of  the  particular  experiment. 
The right and left-eyed images are presented on horizontal halves of a split
screen oscilloscope coated with a high speed P-15 phosphor. The observer
views the two images through prism lenses that are individually adjusted for
comfortable convergence for each session. A septum divides the two halves of
the screen so that neither eye sees the field of view of the other eye.

A trial consists of two presentations; either the first or the second
presentation  contains  a  stimulus  form  (a  repetitively  flashing  dot  or  a  line  of 
seven  dots)  plus  visual  noise,  in  which  case  the  other  presentation  contains 
exactly the same noise pattern plus an additional number of randomly placed
"dummy" dots. The number of dummy dots is equal to the number of dots in
the  stimulus  form  and  limited  in  spatial  extent  to  the  maximum  and  minimum 
values  of  the  dots  of  the  stimulus  form.  Dummy  dots  are  presented  at  the 
same  time  as  the  stimulus  form  dots  would  have  occurred  --  timing 
information  being  transposed  from  the  stimulus  dot  file  to  the  dummy  dot  file 
without  change  during  the  course  of  a  single  trial.  In  this  manner,  both 
presentations  contain  the  same  number  of  dots  and  temporal  patterns. 
Stimulus  form  alone  constitutes  the  sole  difference  between  the  two 
alternative  presentations.  The  observer's  task  is  to  specify  in  which  of  the  two 
presentations  the  organized  (as  opposed  to  the  dummy)  stimulus  form 
occurred. 
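The construction of the two presentations in a trial can be sketched as follows. This is a minimal illustration under stated assumptions: dots are (x, y, z, t) tuples, dummy positions are drawn uniformly within the stimulus form's extent on each axis, and the names `dummy_dots` and `make_trial` are hypothetical rather than taken from the control program.

```python
import random

def dummy_dots(stimulus, rng):
    """Randomly placed dots that keep the stimulus dots' count and timing.

    Positions are limited to the minimum and maximum coordinate values of
    the stimulus form; each dummy dot reuses a stimulus dot's time.
    """
    xs, ys, zs, ts = zip(*stimulus)
    return [(rng.uniform(min(xs), max(xs)),
             rng.uniform(min(ys), max(ys)),
             rng.uniform(min(zs), max(zs)),
             t)
            for t in ts]

def make_trial(stimulus, noise, rng):
    """Return ((first, second), answer) where answer is 1 or 2.

    Both presentations share the same noise dots; one adds the organized
    stimulus form and the other adds the dummy dots.
    """
    by_time = lambda dot: dot[3]
    target = sorted(noise + stimulus, key=by_time)
    foil = sorted(noise + dummy_dots(stimulus, rng), key=by_time)
    if rng.random() < 0.5:
        return (target, foil), 1
    return (foil, target), 2
```

Because the noise, the dot count, and the plotting times are matched, the only difference left between the two presentations is the arrangement of the added dots.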

The  sequence  of  visible  events  in  each  trial  is  presented  in  a  precise 
order.  First,  a  single  fixation-convergence  dot  located  at  the  mathematical 
and  stereoscopic  center  of  the  apparent  cube  is  illuminated  for  one  second. 
The  purpose  of  this  dot  is  to  assist  the  observer  to  align  his  eyes  so  that  the 
subsequent  stimulus  information  is  properly  registered  for  stereoscopic 
viewing.  The  strong  perception  of  a  dot  filled  cube  obtained  in  this 
experiment  indicates  that  this  was  a  successful  strategy  in  spite  of  the  very 
brief  exposure  of  the  individual  dots:  Only  50  microseconds  elapse  before  the 
image  fades  to  less  than  1%  of  its  initial  luminance  on  the  P15  phosphor 
according  to  the  manufacturer's  specifications.  Immediately  following  the 
display  of  the  fixation-convergence  dot,  the  first  of  the  two  presentations 
occurs.  Each  presentation  lasts  for  1  second  during  which  masking  noise  dots 
are continuously presented at equal intervals. Because of the persistence of
the  visual  system's  response  the  apparent  cube  appears  to  be  filled  with  a 
varying  number  of  dots  in  random  position,  an  illusion  not  unlike  a  flurry  of 
snowflakes.  The  particular  stimulus  form  chosen  determines  when,  as  well  as 
where,  its  constituent  dots  are  plotted  within  this  flurry  of  masking  dots.  The 
first  presentation  is  then  followed  by  a  one  second  period  in  which  the  solitary 
fixation-convergence  dot  is  again  presented.  The  second  of  the  two 
presentations  then  occurs.  Following  the  second  presentation  the  screen 
remains  dark  until  the  observer  responds  by  depressing  one  of  two  hand  held 
pushbuttons  indicating  that  he  believes  the  stimulus  form  is  in  the  first  or 
second  presentation  respectively.  At  that  point,  a  "plus"  or  a  "minus" 
indicating  either  a  correct  or  incorrect  choice  is  briefly  flashed  on  the 
oscilloscope.  The  cycle  then  repeats. 

The  entire  experimental  procedure  is  run  totally  automatically.  After 
the  control  program  has  been  initially  loaded  from  disc  memory  into  the 
working  memory  of  the  computer  at  the  beginning  of  each  day,  the  observer 
signs  on  at  the  computer  terminal  and  begins  the  experimental  session  by 
depressing  either  one  of  the  two  response  buttons.  At  the  end  of  fifty  minutes 
the  observer  terminates  the  session  and  the  resulting  data  are  immediately 
analyzed  by  the  computer  and  printed.  Pooled  data  from  several  observers 
and/or conditions are subsequently analyzed by another more comprehensive

Apparatus. The stereoscopic stimuli used in this experiment are
generated  by  a  hybrid  computer  system  consisting  of  a  Cromemco  System  3 
digital  microcomputer  and  a  subsystem  of  Optical  Electronics,  Inc.  analog 
computer  components.  This  hybrid  computer  approach  circumvents  one  of  the 
most  difficult  problems  in  the  presentation  of  this  kind  of  haploscopic  stimuli. 
While  it  is  not  particularly  time  consuming  to  generate  the  tabular 
representation  for  any  single  dot  or  group  of  dots  in  a  digital  computer  (X,Y,Z, 
and  t  coordinates  can  be  generated  by  simple  algorithms  or  prestored 
information)  the  construction  of  the  actual  real  time  analog  representations  of 
such  mathematical  abstractions  is  a  much  more  difficult  programming  task. 
This difficulty is exacerbated in the highly demanding sub-millisecond real
time environment we sought to achieve in the present study. The X,Y,Z and t
coordinates for each dot internally represented in the computer must be
transformed into two sets of two-dimensional coordinates (XL,YL,t and
XR,YR,t) with the proper disparity, perspective, and separation to project a
haploscopic pair of images at the proper locations on the two halves of the
oscilloscope. Each pair of dots in these images must be capable of being
processed by the visual system into an illusion of three dimensional space.
The transformation from X,Y,Z,t to XL,YL,t and XR,YR,t involves extensive
trigonometric calculations that would quickly overload the capacity of all but
the largest computers if attempts were made to carry them out in real time.
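The kind of transformation involved can be illustrated with a simple pinhole-projection sketch. It is not the circuit implemented by the analog subsystem: the geometry (image plane at z = 0, eyes on the x axis at a fixed distance in front of it), the default eye separation and viewing distance, and the function name are all assumptions, and the lateral offset that places each image on its own half of the screen is omitted.

```python
def project_haploscopic(x, y, z, eye_separation=6.5, viewing_distance=50.0):
    """Project a point in the virtual volume to left- and right-eye images.

    Assumed geometry (hypothetical): the image plane is z = 0, z > 0 lies
    behind it, and the eyes sit at (-e/2, 0, -d) and (+e/2, 0, -d).
    Returns ((xl, yl), (xr, yr)) in the same units as the inputs.
    """
    scale = viewing_distance / (viewing_distance + z)  # similar triangles
    half = eye_separation / 2.0
    xl = -half + (x + half) * scale   # ray from the left eye through the point
    xr = half + (x - half) * scale    # ray from the right eye through the point
    y_img = y * scale                 # both eyes share the same height
    return (xl, y_img), (xr, y_img)
```

In this sketch a point on the image plane projects to the same place for both eyes, a point behind the plane yields xr greater than xl (uncrossed disparity), and a point in front yields the reverse.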

The  analog  subsystem  (shown  in  Fig.  2)  provides  a  means  of  finessing 
this  digital  processing  overload  difficulty.  The  trigonometric  problem  is 
solved  by  means  of  analog  circuitry  in  real  time  whenever  the  signals  to  plot  a 
haploscopic  pair  of  dots  on  the  oscilloscope  are  required.  It  is  only  necessary 
to provide this subsystem with the analog voltages representing the three-
space coordinates X,Y,Z at the appropriate time. These analog voltages are
easily and quickly obtained from the internal digital representation by means
of  high  speed  digital  to  analog  converters.  In  our  hybrid  computer  the  digital 
to  analog  converter  used  is  the  California  Data  Corporation  DA-100,  a  four 
channel  system.  Each  channel  is  capable  of  converting  any  single  dimension  of 
the digital representation into the equivalent analog voltage in approximately
three  microseconds.  Three  channels  of  the  system  are  used  to  convert  the 
X,Y,Z  dimensions  and  one  to  regulate  the  spatial  separation  between  the  left 
and  right  eye  oscilloscopic  images.  The  Optical  Electronics  Inc.  components 
carry  out  this  conversion  in  what  is  easily  "real  time".  Disparity  and 
perspective were adjusted with external regulating potentiometers and kept
constant  throughout  the  experiments. 

The band pass (DC to 500 kHz) for the analog Optical Electronics Inc.
units  is  such  that  the  entire  set  of  trigonometric  computations  is  carried  out  in 
a  few  microseconds,  a  duration  comparable  to  the  settling  time  of  the  entire 
electronic  system  used  in  the  study  and  to  one  or  two  average  digital 
computing  instruction  execution  times.  One  thus  has  only  to  wait  for  one  or 
two  computer  instructions  before  sending  an  intensify  signal  (obtained  from 
one  bit  of  a  parallel  output  port)  to  the  oscilloscope  to  maintain  good  dot 
quality.  The  speed  of  generation  of  haploscopic  pairs  of  left  and  right  eye 
images is thus constrained only by the minimal digital computer programming
required to transfer information from an internally stored table of X,Y,Z,t values
(all of which had been either computed or arbitrarily generated prior to the
trial) to the digital to analog converters.

The  times  at  which  the  stimulus  form  and  visual  noise  dots  in  each  trial 
are  presented  are  controlled  by  a  system  of  three  real  time  clocks  located  in 
the  digital  computer.  The  first  clock  regulates  the  times  at  which  dots 
comprising the stimulus forms will be plotted. Each dot of the stimulus is
represented, as we have noted, by four coordinates (X,Y,Z,t). The t value is
used to set this first clock so that the computer will be interrupted at the
appropriate time from a waiting routine to plot the left and right eye images
(XL,YL and XR,YR) of X,Y,Z,t. The second clock is set to interrupt the
computer  at  regular  intervals  --  defined  at  load  time  by  the  experimenter. 
This  is  the  interval  between  the  regularly  spaced  (in  time)  noise  dots.  These 
randomly  (in  space)  positioned  dots  are  plotted  continuously  during  the  entire 
period  of  each  stimulus  presentation  --  one  second  as  controlled  by  the  third 
clock. 
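The interplay of the three clocks can be pictured as the merging of two event streams within a one second window. The sketch below is an offline illustration only, assuming hypothetical names and a simple sorted list in place of the interrupt-driven routine actually used; the noise interval of 4 msec corresponds to 250 noise dots per second.

```python
from typing import List, Tuple

def trial_timeline(stimulus_times_ms: List[float],
                   noise_interval_ms: float,
                   trial_ms: float = 1000.0) -> List[Tuple[float, str]]:
    """Merge stimulus-dot and noise-dot plot times into one ordered list.

    stimulus_times_ms plays the role of the first clock (the t values),
    noise_interval_ms the second clock, and trial_ms the third clock.
    """
    events = [(float(t), "stimulus") for t in stimulus_times_ms
              if 0 <= t < trial_ms]
    n_noise = int(trial_ms // noise_interval_ms)
    events += [(k * noise_interval_ms, "noise") for k in range(n_noise)]
    # The sort is stable, so a stimulus dot precedes a noise dot at the same time.
    events.sort(key=lambda event: event[0])
    return events

stimulus = [350, 400, 450, 500, 550, 600, 650]   # 7 dots, 50 msec apart
timeline = trial_timeline(stimulus, noise_interval_ms=4.0)
```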


The  field  of  view  presented  to  each  eye  on  the  oscilloscope  screen  is 
shielded  by  a  black  paper  through  which  a  pair  of  5.4  deg  x  5.4  deg  apertures 
had  been  cut  for  the  left  and  right  image  respectively.  The  viewing  distance 
from  the  observer's  cornea  to  the  oscilloscope  surface  is  31.75  cm.  The 
screen is far enough from the observer and the persistence of the oscilloscope
is short enough that each dot appears to be virtually point-like in both time
and space. Luminosity was adjusted initially with a Salford S.E.I. photometer
to approximately 0.1 candles/m².
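The physical size of each aperture follows from the stated visual angle and viewing distance. The short calculation below is a sketch that assumes each aperture is centered on the line of sight; the function name is hypothetical.

```python
import math

def extent_from_visual_angle(angle_deg: float, distance_cm: float) -> float:
    """Linear extent (cm) of a centered target subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# 5.4 deg at 31.75 cm corresponds to an aperture of roughly 3 cm on a side.
aperture_cm = extent_from_visual_angle(5.4, 31.75)
```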

The  two  pushbuttons  used  by  the  observer  to  respond  are  connected  to 
Schmitt  triggers  designed  to  smooth  switch  contact  "bounce."  The  outputs  of 
the  Schmitt  triggers  are  fed  to  the  input  of  a  parallel  bit  input  port  of  the 
computer  for  acquisition  and  processing. 

The  Perceived  Cubical  Space.  Stereoscopic  depth  is  defined  by  the 
disparities between XL,YL and XR,YR for each dot. Retinal disparity,
however, does not define absolute depths but rather cues the observer to
relative  depths;  i.e.,  a  dot  may  be  perceived  in  front  of  or  in  back  of  the 
reference  depth  (the  point  in  depth  at  which  the  lines  of  sight  converge). 
Furthermore,  in  the  hybrid  computer  system  utilized  in  the  present  study,  the 
electronic  disparity  adjustment  is  uncalibrated  and  arbitrary.  It  is,  therefore, 
necessary  to  calibrate  the  actual  disparity  of  dot  pairs  by  direct  measurements 
from  photographs  of  special  test  patterns  on  the  display  screen  and  from 
measurements  of  the  distance  from  the  observer's  eye  to  the  display  screen. 
These  angular  measurements  are  then  related  to  the  Z  axis  values  stored 
within  the  computer.  It  should  be  noted  that  this  relationship  between 
disparity  and  internally  represented  Z  values  is  accurate  only  for  our  system 
and  as  it  is  adjusted  for  these  experiments.  Within  this  constraint,  we 
determined  that  if  the  observer  fixated  on  the  fixation-convergence  dot 
centered  in  the  cube,  then  the  maximum  crossed  relative  disparity  for  a  dot 
positioned  on  the  front  surface  of  the  apparent  cube  was  14  min.  of  visual 
angle and the maximum uncrossed disparity for a dot positioned on the rear
surface  of  the  apparent  cube  was  also  14  min.  of  visual  angle.  These 
maximum  crossed  and  uncrossed  disparity  values  were  arbitrarily  chosen  so 
that  the  perceived  space  appeared  as  close  to  an  apparent  cube  as  possible. 
Because  of  the  several  stages  of  transformation  involved,  all  disparity  values 
should be considered to be approximate. Furthermore, in some of the
experiments  reported  here  less  than  this  full  range  of  disparity  was  utilized. 
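For orientation only, one can ask how far from the fixation plane a real point would have to lie to produce a relative disparity of 14 min. The sketch below assumes a 6.5 cm interocular separation (a value not given in the text) and a point on the midline; the function name is hypothetical. Because disparity cues only relative depth, the perceived cube need not match this geometric figure.

```python
import math

def distance_for_disparity(fixation_cm: float, disparity_arcmin: float,
                           eye_sep_cm: float = 6.5) -> float:
    """Distance of a real midline point with the given disparity relative to fixation.

    Positive disparity_arcmin means crossed (nearer than fixation);
    negative means uncrossed (farther). eye_sep_cm is an assumed value.
    """
    half_vergence = math.atan(eye_sep_cm / (2.0 * fixation_cm))
    delta = math.radians(disparity_arcmin / 60.0)
    return eye_sep_cm / (2.0 * math.tan(half_vergence + delta / 2.0))

near_cm = distance_for_disparity(31.75, +14.0)   # front surface of the cube
far_cm = distance_for_disparity(31.75, -14.0)    # rear surface of the cube
```

Under these assumptions both surfaces lie well under one centimeter from the fixation plane.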

EXPERIMENTAL  DESIGN  AND  RESULTS 


Experiment  I. 


Design  and  rationale:  Experiment  I  is  the  foundation  study  for  all 
experiments  involving  straight  lines.  In  particular,  this  first  experiment  was 
designed  to  examine  the  detection  of  regular  dotted  lines  in  a  visual  noise 
filled  stereoscopic  space.  Regular  dotted  lines,  for  the  purpose  of  this 
experiment,  are  defined  as  those  in  which  the  dots  are  separated  by  equal 
intervals  in  both  time  and  space.  Fig.  3  displays  in  a  graphic  manner  the  four 
different  diagonally  oriented  dotted  line  stimuli  that  were  used  in  this 
experiment superimposed in a single drawing. However, it must be
remembered  that  only  one  of  these  lines  is  used  in  any  one  stimulus  trial.  It 
always  consists  of  seven  dots.  The  outer  outline  cube  in  Fig.  3  represents  the 
total  extent  of  the  volume  in  which  the  dotted  visual  noise  is  evenly 
distributed. This apparent cubical volume is defined by the 5.4 deg by 5.4 deg
areas  presented  to  each  eye  in  the  X-Y  plane  and  by  disparities  ranging  from 
14  minutes  (crossed)  to  14  minutes  (uncrossed).  Stimulus  forms  are  presented 
in  the  slightly  smaller  volume  indicated  by  the  inner  outline  cube.  The  X  and 
Y  dimensions  of  the  smaller  cube  are  both  limited  to  3.25  deg.  The  apparent 
depth  of  the  first  and  last  dots  of  each  diagonal  line  is  set  by  disparity  values 
of  12.25  min  uncrossed  and  12.25  min  crossed  respectively.  The  dots  of  each 
stimulus  line  are  sequentially  plotted  from  the  back  plane  of  the  inner  cube  to 
the  front  plane  as  indicated  by  the  arrow  heads  in  Fig.  3.  Neither  the  inner 
nor  outer  outline  cubes  are  ever  visible  to  the  observer. 


The  direction  of  the  dotted  stimulus  line  is  one  of  the  parameters 
manipulated  in  this  experiment.  We  explored  this  variable  to  determine  if 
visual  space  is  isotropic  for  this  kind  of  visual  information  processing.  Three 
other  parameters  influencing  line  detection,  however,  were  the  main  targets 
of  our  research  in  this  experiment.  These  three  were  plotting  interval,  noise 
density, and viewing condition. To examine the effect of interval, the seven
dots  composing  each  stimulus  line  are  plotted  in  sequential  order  with  the 
delays  between  successive  dots  varying  from  trial  to  trial.  The  intervals  used 
in this experiment are 10, 20, 30, 40, 50, 60, and 70 msec.
The middle dot -- the fourth -- is always plotted at the temporal midpoint
(t=500 msec) of the presentation interval. At the shortest interval, the entire
line of dots appears to the observer to be plotted simultaneously. At longer
intervals, the dots appear to be successively plotted, giving rise to an
increasingly strong impression of a single dot in apparent movement, but at a
progressively slower velocity as the selected interdot interval increases.
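The timing rule just described, in which equally spaced dots are centered on the 500 msec midpoint, can be written out as a small sketch. The function name is hypothetical and the code only illustrates the arithmetic; it is not the program used in the experiments.

```python
from typing import List

def plot_times_ms(n_dots: int, interval_ms: float,
                  midpoint_ms: float = 500.0) -> List[float]:
    """Onset times for n_dots equally spaced in time and centered on midpoint_ms."""
    first = midpoint_ms - interval_ms * (n_dots - 1) / 2.0
    return [first + k * interval_ms for k in range(n_dots)]

# Seven-dot lines at the shortest and longest intervals used in Experiment I.
fastest = plot_times_ms(7, 10)   # spans 470 to 530 msec
slowest = plot_times_ms(7, 70)   # spans 290 to 710 msec
```

Even at the longest interval the line occupies 420 msec, so it fits within the one second presentation.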

A major reason for studying the effects of interval is to compare what
a priori might have been hypothesized to be compensatory effects of apparent
motion on nonsimultaneous dot plotting. We know that simultaneously
appearing lines of dots are easily detected, and it is obvious that there should
be some degradation of line detectability at very long intervals. There was,
however,  the  possibility  that  an  increase  in  apparent  motion  might  compensate 
for  the  loss  in  simultaneity.  It  was  not  possible,  therefore,  to  predict  at  the 
outset  with  any  certainty  what  the  effect  of  interval  would  be. 

All  of  the  four  directions  and  the  seven  intervals  used  in  this 
experiment are presented in each daily session. On separate daily sessions,
however, the two other parameters -- viewing condition and noise density --
are varied. To determine the effects of viewing condition, each daily session
was repeated six times at each noise density -- twice using dichoptic
viewing (in which stereopsis was possible) and twice with an eye patch over
each eye in turn so that only monocular cues were available. Our purpose here was to
determine  what  advantage,  if  any,  was  gained  from  stereopsis.  Five  different 
noise densities were chosen such that the stimulus line was embedded in 125,
166, 250, 500, and 1000 dots per second, in this order. Following
this series, the entire experiment was repeated in reverse order.
Thirty  sessions  (3  viewing  conditions  x  5  noise  levels  x  2  series)  were  thus 
required  to  complete  this  experiment. 

Results: The major results of Experiment I are plotted in Figs. 4, 5, 6,
7, and 8 in order for masking noise densities of 125, 166, 250, 500, and 1000
dots/sec respectively. On each of these graphs, the abscissa represents the
temporal  interval  between  plotting  each  of  the  dots  making  up  the  stimulus 
line.  The  ordinate  represents  the  proportion  of  trials  in  which  the  observer 
selected the correct presentation; i.e., the one in which the stimulus line,
rather than the dummy noise, was present. The three parametric curves in
each  of  these  figures  represent  the  data  obtained  for  the  three  viewing 
conditions  on  three  successive  days.  The  data  obtained  from  all  four  line 
directions  have  been  pooled  to  produce  these  graphs. 

Three  main  results  are  to  be  noted  in  this  set  of  figures.  First,  the 
general  trend  produced  by  varying  noise  density  is  evident.  The  overall 
performance  of  the  observers  decreases  as  the  noise  density  increases.  Under 
optimum  conditions  of  minimal  noise,  binocular  viewing,  and  the  briefest 
interdot intervals (data typified by the left hand portion of Fig. 4), observers
perform  at  the  95%  correct  detection  level,  a  score  that  is  about  the  best  that 
can  be  expected  in  experiments  of  this  kind.  On  the  other  hand,  when  the 
visual  noise  is  the  densest,  the  temporal  intervals  between  the  dots  of  the 
stimulus  line  are  long,  and  only  monocular  viewing  is  allowed  (as  exemplified 
by  the  right  hand  side  of  Fig.  8),  observers  perform  at  virtually  chance  levels 
(50%  for  the  two  alternative  forced  choice  design  used  here). 

Second,  the  effect  of  viewing  condition  is  also  clear.  For  virtually  all 
experimental  conditions,  there  is  a  clear  advantage  obtained  by  stereoscopic 
viewing  in  this  detection  task.  This  effect  is  modulated  by  ceiling  effects  for 
low  noise  densities  and  floor  effects  for  high  noise  densities,  but  the 
stereoscopic  advantage  is  pervasive  throughout  all  five  graphs.  This  advantage 
is substantial: in some conditions it is greater than 12 percentage points, a
value  which  is  over  a  quarter  of  the  range  of  responses  obtainable  in  this  type 
of  experiment.  On  the  other  hand,  differences  between  the  two  monocular 
viewing  conditions  are  small. 

Third,  and  most  important  for  the  purposes  of  this  study,  there  is  also 
an  unequivocal  and  major  effect  of  dot  plotting  interval  evidenced  in  this 
experiment.  Any  hypothetical  compensation  effect  of  apparent  motion  is 
obviously  swamped  out  by  the  loss  of  the  much  stronger  influence  of 
simultaneous  plotting.  Dotted  stimulus  lines  become  progressively  less 
detectable  as  the  interval  between  them  increases.  Indeed,  the  slope  of  the 
function  relating  detectability  and  interdot  interval  is  virtually  constant. 
There  is  not  the  slightest  suggestion  of  even  a  slowing  of  the  diminution  in 
detectability at long intervals -- a phenomenon that would have been expected
if apparent motion had any significant influence on detectability. Whatever
detection mechanism is at work here, apparent motion cannot substitute for
simultaneity.  As  we  shall  see  later,  however,  irregularities  in  time  and  space 
do  seem  to  be  smoothed  over  by  stimulus  conditions  that  produce  apparent 
motion. 

Figure  9  displays  the  results  of  the  other  major  parameter  of  this 
experiment -- track direction. Only data from the 166 noise dot/sec condition
have  been  presented  here,  but  all  other  dot  densities  produce  similar  results. 
These  data  strongly  suggest  that  visual  space  is  isotropic  —  there  is  no 
advantage  accruing  to  any  of  the  four  track  directions  for  the  perceptual 
mechanisms  underlying  dotted  line  detection.  This  insensitivity  to  orientation 
and  direction  in  three  dimensions  is  in  accord  with  our  earlier  findings  (Uttal, 
1975)  for  two  dimensional  stimuli. 

Experiment  II. 

Design  and  Rationale:  Experiment  II  investigates  the  effect  of 
temporal  irregularity  on  the  detectability  of  a  dotted  stimulus  line.  The  main 
independent  variable  in  this  experiment  is  variability  of  the  intervals  of  time 
between the successively plotted dots of the line. The mean interval is
arbitrarily  set  at  50  msec.  This  value  gives  a  moderately  high  average 
detection  score  of  approximately  85%  for  the  group  of  observers  used  in  this 
experiment.  Interval  irregularity  is  measured  in  terms  of  standard  deviation 
from the mean 50 msec value. Standard deviations of 0, 2.9, 6.45, 10.4, 14.4,
16.8, and 20.4 msec are utilized. A standard deviation of 16.8 msec, for example,
corresponds to an interval sequence of 75, 35, 25, 50, 65, and 50 msec
respectively. The masking noise density is kept constant throughout
Experiment II at 250 dots/sec and only the dichoptic viewing condition is used
to test the detectability of straight lines of seven evenly spaced collinear dots.
Each observer was run twice under these conditions.
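The irregularity measure can be checked against the example sequence given above. The sketch below assumes that the standard deviation is the population (divide-by-N) form, which is the form that reproduces the reported 16.8 msec; the function name is hypothetical.

```python
import math
from typing import Sequence

def population_sd(values: Sequence[float]) -> float:
    """Standard deviation using the population (divide-by-N) formula."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

intervals_ms = [75, 35, 25, 50, 65, 50]   # example sequence from the text
mean_ms = sum(intervals_ms) / len(intervals_ms)   # 50 msec
sd_ms = population_sd(intervals_ms)               # about 16.8 msec
```

The six intervals average 50 msec, so the irregular sequence leaves the mean interval, and hence the total duration of the line, unchanged.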

Results:  Figure  10  depicts  the  results  of  this  examination  of  the 
effects  of  temporal  interval  irregularity  on  dotted  line  detection. 
Surprisingly,  there  is  virtually  no  observable  effect  of  irregularity  measured  in 
this experiment. The visual system, however sensitive it may be to mean
interval, appears to be totally insensitive to even the most extreme temporal
interval  irregularities  when  measured  by  this  dotted  form  detection  paradigm 
at  a  50  msec  mean  interval. 


Experiment  III. 

Design and Rationale: In our earlier work (Uttal, 1975) spatial
irregularity had been shown to be a powerful determinant of the detectability
of straight lines in two dimensional space. We had initially assumed,
therefore, that spatial irregularity would also be a powerful influence on
detectability and had not planned to attempt to confirm this presupposition.
However,  the  surprising  results  of  Experiment  II  suggested  that  this  hypothesis 
should  indeed  be  verified  and  measured.  Therefore,  Experiment  III  was 
designed  utilizing  dotted  line  stimuli  consisting  of  seven  dots  with  regular 
temporal  intervals,  but  irregular  dot  spacing.  In  this  experiment,  the  standard 
deviation  of  the  spacing  is  used  as  the  independent  variable;  specifically,  the 
spatial  coordinates  of  an  evenly  spaced  dotted  line  are  changed  to  create 
irregular  intervals  proportionately  equivalent  to  the  values  of  temporal 
irregularity used in Experiment II. That is, where there was a 10% increase in
one temporal interval and a compensating 10% decrease in another temporal
interval (designated as ±10%) in Experiment II, we introduced an equivalent
10% change in two spatial separations in Experiment III. Seven combinations
of separation changes were used in this experiment defining progressively
increasing spatial irregularity values. These combinations are ±0; ±10%; ±10%
and ±20%; ±20% and ±30%; ±50%; ±30% and ±50%; and finally, ±50% and ±50%.
Because  of  the  arbitrary  value  of  the  Z-axis,  no  particular  units  can  be 
associated  with  the  actual  Euclidean  distances  corresponding  to  these 
irregularity  values  (i.e.,  to  add  degrees  of  visual  angle  subtended  in  the  X-Y 
plane  to  seconds  of  stereodisparity  would  be  meaningless.)  Therefore,  we 
have simply designated the seven irregularity values as 0, 1, 2, 3, 4, 5, and 6 on
Fig.  11.  A  single  regular  temporal  interval  value  of  50  msec  and  a  single 
noise  level  of  250  dots/sec  were  used  in  this  experiment.  Only  the  dichoptic 
viewing  condition  was  used  and  each  of  four  observers  was  run  twice  under 
these  conditions. 

Results:  The  results  of  Experiment  III  are  plotted  in  Fig.  11.  The 
outcome  is  even  more  surprising  than  that  of  Experiment  II.  These  data 
indicate  that  there  is  virtually  no  effect  of  spacing  irregularity  when  the  dots 
in  the  line  are  separated  by  a  period  of  time  that  is  long  enough  to  produce  a 
substantial  apparent  motion!  This  is  a  remarkable  result  in  light  of  the  fact 
that  the  detectability  of  dotted  lines  in  the  two  dimensional  case  in  which  all 
of  the  dots  are  presented  simultaneously  is  extremely  sensitive  to  spacing 
irregularity.  Yet,  in  this  dynamic  three-dimensional  case  there  is  but  the 
slightest  suggestion  (2  or  3%)  of  a  diminution  of  the  response  accuracy  at  the 
greater  irregularity  values.  In  the  two  dimensional  simultaneous  dot 
presentation  case,  the  difference  obtained  for  comparable  conditions  was  12  to 
14% (See Fig. 2-13 in Uttal, 1975).

Experiment IV.

Design  and  Rationale:  Experiment  IV  is  the  foundation  experiment  for 
the  study  of  the  detectability  of  repetitively  flashing  single  dots  embedded 
within  a  volume  of  randomly  placed  visual  noise  dots  (each  of  which  flashes 
only  once).  Three  independent  variables  are  manipulated  in  this  experiment. 
First,  the  interflash  interval  between  successive  flashes  of  the  stimulus  dot  is 
varied.  It  must  be  remembered  that  the  stimulus  dot  is  distinguished  from  any 
of  the  random  noise  dots  only  by  the  fact  that  it  is  flashed  4  times  in  a  single 
position  rather  than  only  once.  The  three  regular  temporal  intervals  between 
successive flashes are set at one of the values 50, 100, 150, or 200 msec for
each  trial.  Stimuli  with  these  four  interval  values  are  presented  in  random 
order  during  each  daily  session.  The  stimulus  dot  in  each  presentation  is 
always  timed  such  that  the  stream  of  repetitive  flashes  is  centered  at  the 
temporal  midpoint  (500  msec).  The  same  time  pattern  was  used  in  the  dummy 
position  presentation,  but  no  dot  position  was  repetitively  flashed  in  this  case. 

The  second  independent  variable  is  the  position  of  the  dot.  The  flashing 
dot  occupied  any  one  of  the  seven  possible  positions  shown  in  Fig.  12  in  each 
trial. The outer cube shown in this drawing delimits the (5.4 deg x 5.4 deg x 28
min. disparity) cube perceived by the observer. The inner outline cube depicts
the  seven  possible  locations  of  the  flashing  dot  in  each  trial,  but  it,  like  the 
outer  outline  cube,  is  not  visible  to  the  observer.  For  example,  position  1  is 
situated  at  the  perceived  center  of  the  apparent  cube,  i.e.,  the  location  of  the 
fixation-convergence  point.  As  another  example,  location  5  is  centered  on  the 
right  hand  side  of  the  inner  outline  cube.  Which  of  the  seven  positions  is  used 
in  each  trial  is  chosen  randomly  prior  to  each  trial. 

The  third  parameter  varied  in  this  experiment  is  the  masking  noise 
density.  Densities  of  10,  14,  20,  33,  and  100  dots/sec  were  utilized.  Since  no 
differences  were  observed  in  left  and  right  eye  monocular  viewing,  and  to 
provide  a  check  that  monocularity  per  se  was  not  accounting  for  the 
difference  obtained  in  other  experiments,  our  control  for  stereopsis  in  this 
case  was  binocular  viewing.  Thus,  only  two  viewing  conditions  are  used  in  this 
experiment,  the  standard  dichoptic  one  which  allowed  stereoscopic  perception 
and  the  binocular  one  in  which  both  of  the  observer's  eyes  viewed  the  left  eye 
image  of  the  stereoscopic  pair.  In  the  binocular  condition,  therefore,  no 
disparity,  and  thus  no  perceived  depth,  is  present. 

The  experiment  was  designed  so  that  each  daily  session  included  all 
possible  combinations  of  the  7  positions  and  4  flashing  rates  presented  in 
random  order,  but  viewing  condition  and  noise  density  were  held  constant  each 
day. The experiment was performed dichoptically and then binocularly on
alternate days. On successive pairs of days, the noise density was varied
starting from the minimum value of 10 dots/sec and ending on the 9th and 10th
days with the maximum value of 100 dots/sec. The entire experiment was
then  repeated  varying  masking  noise  dot  densities  in  the  descending  order. 

Results:  The  results  of  this  foundation  experiment  for  flashing  single 
dots  are  plotted  in  three  separate  graphs.  Figure  13  displays  the  effect  on 
detection of varying the masking noise density. As expected, there is a
progressive  decline  in  detectability  of  the  flashing  dot  as  the  noise  density 
increases.  Nevertheless,  it  is  somewhat  surprising  to  note  that  a  single  dot 
flashing only 4 times is still partially detectable (i.e., at better than chance
levels)  even  though  it  is  camouflaged  by  the  frenetic  blinking  of  100  random 
dots.  The  distinct  advantage  of  the  stereoscopic  view  over  the  binocular  one 
is  also  clearly  evidenced  here  just  as  stereoscopic  viewing  proved  to  be 
superior  to  the  monocular  viewing  conditions  in  Experiment  I.  However,  the 
disadvantage  obtained  with  binocular  viewing  in  Experiment  IV  is  only  half  of 
that obtained in the earlier experiments when monocular viewing was used as
the control condition.

The  influence  of  the  position  of  the  flashing  dot  stimulus  is  plotted  in 
Fig.  14.  The  only  dot  position  that  appears  to  have  any  substantial  advantage 
over  the  others  is  position  1  --  the  one  located  at  the  very  center  of  the  inner 
cube.  The  only  dot  that  appears  to  have  any  substantial  disadvantage  is  the 
one  located  in  the  bottom  rear  lower  corner  of  the  inner  cube.  Other  than 
that,  all  dot  locations  appear  to  have  roughly  equal  probability  of  detection. 
However,  once  again  the  stereoscopic  advantage  is  clear  --  only  position  4 
seemed  to  not  display  this  advantage  and  we  believe  this  to  be  a  spurious 
fluctuation  rather  than  a  true  nondifference. 

Finally, Fig. 15 plots detectability for all data collected at all noise
levels as a function of the interflash interval. Most interestingly, the
resulting curve is nonmonotonic. Peak detectability occurs at an interval of
100 msec; there is a sharp decline in detectability for longer interflash
intervals and a less sharp decline for shorter ones.

Experiment  V. 

Design  and  Rationale:  Experiment  V  is  the  analog  of  Experiment  II  in 
that  it  is  concerned  with  temporal  irregularity.  In  this  experiment,  however, 
the  regularity  of  the  temporal  intervals  between  successive  flashes  of  a  dot 
stimulus  positioned  at  a  single  point  in  space  (rather  than  along  a  dotted  line) 
is varied as the independent variable. The temporal irregularity is measured in
units of standard deviation about a mean flicker interval of 100 msec. Six
values of this measure are used: 0, 4.1, 8.2, 12.2, 16.3, and 20.4
msec. The combinations used here, therefore, are ±0%; ±5%; ±10%; ±15%;
±20%; and finally ±25%. In order to use this full range of six irregularity
values,  the  number  of  dot  positions  utilized  in  each  daily  session  in  Experiment 
V  had  to  be  reduced  from  the  seven  locations  used  in  Experiment  IV  to  four. 
The  four  positions  utilized  are  those  numbered  1,  4,  5,  and  6  in  Fig.  12. 
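The correspondence between the percentage combinations and the standard deviations listed for Experiment V can be reproduced with a short calculation. The sketch below rests on an inferred reading, namely that each combination lengthens one of the three interflash intervals and shortens another by the stated percentage of the 100 msec mean, and that the population form of the standard deviation is used. The function name is hypothetical.

```python
import math

def sd_for_paired_change(percent: float, mean_ms: float = 100.0,
                         n_intervals: int = 3) -> float:
    """Population SD when one interval is lengthened and one shortened by percent."""
    change = mean_ms * percent / 100.0
    intervals = [mean_ms] * n_intervals
    intervals[0] += change    # lengthened interval
    intervals[1] -= change    # compensating shortened interval
    mean = sum(intervals) / n_intervals
    return math.sqrt(sum((t - mean) ** 2 for t in intervals) / n_intervals)

# Four flashes give three intervals; rounding to one decimal place
# gives the six standard deviations listed in the text.
sds = [round(sd_for_paired_change(p), 1) for p in (0, 5, 10, 15, 20, 25)]
```

Under this reading the computed values agree with the reported 0, 4.1, 8.2, 12.2, 16.3, and 20.4 msec.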

Results:  Figure  16  displays  the  results  of  Experiment  V.  Once  again,  as 
in  Experiment  II  there  is  a  remarkable  and  surprising  insensitivity  to  wide 
variations  in  the  regularity  of  temporal  intervals  between  successive  flashes  of 
a single dot. The curve is virtually flat over the full range of interval
irregularity values.

DISCUSSION

These  then  are  the  findings  that  we  have  obtained  in  our  study  of  the 
influence of stimulus form on the detectability of dotted patterns in
stereoscopic  space.  The  discussion  now  presented  is  divided  into  two  parts. 
First  we  will  consider  the  significance  of  our  work  in  helping  to  understand  the 
nature  of  vision  in  general  and  dotted  form  detection  in  particular.  We  will 
then consider some practical matters upon which we believe our data impinge
and  make  some  suggestions  concerning  possible  future  research  efforts. 

Perceptual  Significance.  It  is  important  to  remind  the  reader  that  all 
of  the  experiments  reported  here  are  confounded  by  the  presence  of  monocular 
cues.  Both  flashing  dots  and  dotted  straight  lines  are  detectable  to  a  certain 
degree  in  monocular  viewing  conditions.  However,  one  of  the  major  findings 
that  has  emerged  from  this  study  is  our  confirmation  of  earlier  work  (e.g., 
Smith,  Cole,  Merritt,  and  Pepper,  1976;  Pepper,  Cole,  Merritt,  and  Smith,  1978) 
that this confounding is not total and that there is a substantial stereoscopic advantage in
detecting  orderly  (in  time  or  space)  dots  in  a  volume  filled  with  masking  dots. 
This  result  seems  to  be  ubiquitous  and  uniform  within  the  limits  of  statistical 
fluctuation  of  the  kinds  of  experiments  carried  out  in  our  laboratory.  One  has 
only  to  compare  the  last  frame  of  Fig.  1  when  viewed  stereoscopically  and 
when  viewed  monocularly  to  appreciate  the  advantage  of  stereoscopic  viewing 
for  complex  stimuli  of  this  kind. 

How  does  one  account  for  the  advantage  of  stereoscopic  viewing  in  the 
masking  paradigm?  The  answer  to  this  question  is  probably  closely  related  to 
one  that  may  be  suggested  to  account  for  the  data  obtained  by  Fox  and  his 
colleagues (Fox, 1980; 1981; Lehmkuhle and Fox, 1980) for metacontrast and
contour  interaction  and  by  Ogle  and  Mershon  (1969)  and  Mershon  (1972)  for 
simultaneous  contrast  as  a  function  of  the  apparent  depth  between  the 
inducing  and  induced  stimuli.  The  central  idea  in  this  speculative  suggestion  is 
that  any  explanation  of  these  phenomena  based  upon  peripheral  lateral 
inhibitory  interactions  is  incapable  of  accounting  for  the  associated  decline  in 
the  interactive  effects  and,  therefore,  the  responsible  process  must  be  a 
function  of  the  central  nervous  system.  In  these  experiments  the  two 
dimensional  attributes  of  the  image  projected  on  either  retina  remain  constant 
as  disparity  changes;  i.e.,  the  horizontal  spatial  separation  between  the 
foreground and background elements of the stimuli remains nearly constant.
Thus  any  putative  peripheral  interaction  should  remain  constant. 
Nevertheless,  there  is  a  progressive  reduction  of  the  magnitude  of  both 
simultaneous  and  meta-type  contrast  as  the  apparent  depth  difference 
increases. Thus, the "distance" between the two interacting stimulus elements
that  seems  to  be  significant  is  not  the  distance  projected  onto  the  2- 
dimensional  physical  plane  of  the  retina,  but  rather  the  "true"  volumetric 
distances  in  the  perceptually  constructed  X,Y,Z  volume. 

In  our  experiments,  the  same  sort  of  explanation  seems  to  hold.  That 
is, the effect of the masking noise is not a function of its density in the
physically projected retinal plane, but rather of its density in the apparent
three  dimensional  space.  Therefore,  spreading  the  dots  further  apart  in  the 
perceptually  constructed  depth  dimension  is  the  equivalent  of  spreading  them 
further  apart  in  the  projected  plane.  Since  volumes  have  more  constituent 
unit  elements  than  planes,  the  average  density  of  the  masking  noise  must 
decrease  when  a  plane  is  extruded  into  a  volume  even  though  there  is  no 
change  in  the  number  of  visual  masking  dots  present. 
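
The geometric point can be illustrated with a minimal sketch in Python. It is not the procedure used in the experiments reported here; the dot count, the unit-square field, and the depth range are assumptions chosen only for illustration. The same set of dots is measured twice, once in the projected X,Y plane and once after each dot has been assigned a Z value, and the mean nearest-neighbor distance serves as a simple stand-in for (inverse) noise density.

```python
import math
import random


def mean_nearest_neighbor(points):
    """Average distance from each point to its closest neighbor."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def plane_vs_volume(n_dots=100, depth=1.0, seed=0):
    """Return (planar, volumetric) mean nearest-neighbor distances.

    n_dots, depth and seed are illustrative assumptions, not values
    taken from the experiments.
    """
    rng = random.Random(seed)
    # Dots as they fall on the projected (retinal) plane.
    flat = [(rng.random(), rng.random()) for _ in range(n_dots)]
    # Same X,Y positions, but each dot now also has an apparent depth Z.
    solid = [(x, y, rng.uniform(0.0, depth)) for x, y in flat]
    return mean_nearest_neighbor(flat), mean_nearest_neighbor(solid)


d2, d3 = plane_vs_volume()
print(f"mean nearest-neighbor distance in the plane:  {d2:.3f}")
print(f"mean nearest-neighbor distance in the volume: {d3:.3f}")
```

Because adding a Z difference can never shorten the distance between two dots, the volumetric spacing is at least as large as the planar spacing for every dot, even though the number of dots and their projected positions are unchanged.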

In  a  more  philosophical  vein,  we  should  note  that  the  results  obtained  by 
Fox  and  his  associates,  Gogel,  Mershon,  as  well  as  those  from  both  this  present 
study  and  earlier  studies  from  our  laboratory  (Uttal,  Fitzgerald  and  Tucker, 
1975b) supporting the perceptual equivalence of the X, Y, and Z axes represent
an  extraordinary  outcome.  These  data  jointly  suggest  that  Z  axis  distances, 
constructed  from  indirect  and  nonisomorphic  aspects  (disparity)  of  the 
stimulus, are just as "real" in a perceptual sense as are the X and Y distances
that  do  have  a  more  direct  and  isomorphic  physical  counterpart  (retinal 
distance).  Considering  that  stereoscopic  depth  is  the  indirect  result  of 
invariance computations based on the magnitude of minute retinal disparities,
the  unavoidable  conclusion  to  which  we  are  compelled  is  that  the  X  and  Y 
distances  may  themselves  also  be  "constructs"  calculated  on  the  basis  of  some 
equally  indirect  relationship  between  the  retinal  image  and  the  perceived 
plane.  It  is,  according  to  this  point  of  view,  only  fortuitous  that  the  perceived 
space  appears  to  be  isomorphic  to  the  stimulus  space  in  the  X  and  Y 
dimensions.  Thus,  this  line  of  thought  suggests  that  there  is  nothing  especially 
direct or real about X and Y, but, rather, they are as indirect as the Z
dimension. 

Pursuing  this  line  of  thought,  the  totality  of  our  visual  experience  can 
thus  be  considered  to  be  indirect,  not  only  the  obviously  constructed 
dimensions that are computed from invariant coding relationships among
alternative  representations  of  the  stimulus  object.  While  this  logic  leads  to  a 
model  of  a  perceptual  world  that  is  in  practical  terms  no  different  than  the 
classic  deterministic  stimulus-response  point  of  view,  it  is  substantially 
different  in  terms  of  the  epistemological  model  that  must  be  invoked  to 
explain  how  we  actually  perceive. 

Another  less  weighty  aspect  of  our  study  concerns  the  different  results 
that  are  obtained  with  binocular  viewing  (in  which  both  of  the  observers'  eyes 
see the same nondisparate stimulus) and with monocular viewing (in which one
eye  is  covered  with  an  eye  patch).  In  general,  binocular  viewing  is  superior  to 
monocular  viewing  (compare  Figs.  4-8  and  Fig.  13)  by  a  factor  of  at  least  two. 
Theoretically,  however,  the  information  available  in  the  two  viewing 
conditions  is  identical  since  there  is  no  disparity  in  the  binocular  viewing 
condition -- the two eyes are seeing exactly the same thing. The simple fact
of binocularity of nondisparate images, therefore, offers nothing in the way of
additional  stimulus  information  to  the  observer  that  is  not  in  the  single 
monocular  image.  Nevertheless,  we  have  determined  that  the  binocular 
viewing  condition  does  have  an  advantage  over  the  monocular  one.  This  may 
be  due  to  some  subtle  advantage  in  central  nervous  system  processing  that  is 
gained when the images from the two eyes are identical. In other words,
redundancy itself may be of value. However, the binocular advantage may
also  arise  from  artifacts  of  far  less  theoretical  significance.  Such 
uninteresting  factors  as  simple  distraction  resulting  from  the  very  fact  of  the 
occlusion  of  vision  to  one  eye  or  even  the  presence  of  the  eye  patch  itself  may 
be  involved.  The  resolution  of  this  matter  is  left  to  others.  It  is  important  to 
us  only  to  note  that,  for  our  observers  and  in  this  kind  of  experiment,  the 
substantial  advantage  of  binocular  (as  opposed  to  dichoptic)  over  monocular 
viewing  is  an  empirical  fact. 

Another  outcome  of  the  experiments  reported  here  of  interest  in  our 
search  for  an  understanding  of  visual  perception  is  the  isotropic  nature  of 
visual  space  obtained  in  these  experiments.  There  is  no  difference  in  the 
detection  scores  in  Experiment  I  as  a  function  of  the  orientation  of  a  line  of 
simultaneously  presented  dots  nor  of  the  direction  of  the  trajectory  of  a 
sequence  of  dots  traced  out  so  slowly  as  to  produce  apparent  movement.  This 
insensitivity  to  orientation  and  direction  in  these  three  dimensional 
experiments  is  consistent  with  what  we  have  observed  in  two  dimensional 
space  for  similar  dotted  patterns  (Uttal,  1975),  but  inconsistent  with  what 
many  other  students  of  vision  had  previously  observed  for  continuous  stimuli 
(as  summarized  in  Appelle,  1972).  It  is,  therefore,  possible  that  the  lack  of 
continuity  of  the  dotted  stimulus  forms  we  use  is  a  special  property  and  the 
extrapolation  of  this  concept  of  isotropic  visual  space  to  continuous  stimuli 
would  be  inappropriate. 

One  can  speculate  why  this  difference  between  dotted  and  continuous 
forms  exists.  One  speculation  leads  to  the  suggestion  that  the  very  same 
attributes that produce the advantages of dotted patterns also give rise to
the  observed  differences  in  orientation  sensitivity.  Dots  are  isolated  entities 
both  in  the  mathematical  and  the  neurophysiological  senses;  they  are  not 
"connected"  to  other  dots  in  other  locations  in  the  field  of  view  in  the  same 
way  as  are  the  elements  of  a  continuous  form.  Rather,  we  see  dotted  forms 
because  of  their  global  arrangement.  Thus  each  dot  in  the  physical  stimulus, 
in  the  projection  on  the  retina,  and  perhaps  even  in  the  neural  networks 
representing  that  dot,  functions  discretely  and  independently.  These  discrete 
points  have  no  direction  or  orientation  of  their  own.  Only  the  global  pattern,  a 
property  that  is  properly  denoted  as  an  abstraction,  has  direction  and/or 
orientation. But that abstraction has no physical reality; it possesses only an
intangible  organizational  reality.  Presumably  this  kind  of  form  is  so 
intangible  that  it  does  not  activate  the  same  mechanisms  as  do  physically 
continuous  stimuli.  It  is  for  this  reason  that  the  perception  of  dotted  patterns 
may  be  insensitive  to  direction  and  orientation,  in  a  manner  quite  different 
from  the  sensitivity  exhibited  in  the  perception  of  continuous  lines  and 
contours. 

The  next  point  in  our  findings  to  be  considered  concerns  the  difference 
in  detectability  of  dotted  lines  with  small  interstimulus  dot  intervals 
(perceptually  simultaneous)  and  lines  with  such  long  intervals  that  the 
sequential  nature  of  the  patterns  becomes  clear  and  apparent  motion  may  even 
be  experienced.  As  suggested  earlier,  one  a  priori  hypothesis  would  have 
suggested that the apparent motion attributes of a stimulus might at least
partially compensate for the reduction in apparent simultaneity. However, our
data  provide  no  evidence  of  such  a  compensatory  effect.  The  greater  the 
interval between the dots of the stimulus form, the less detectable the forms
are, regardless of how strong a perception of a moving trajectory is reported
by the observer. The mechanism that detects coherent forms among dot
patterns  is  better  able  to  process  information  when  it  is  presented 
simultaneously  than  when  it  is  distributed  in  time.  The  strong  effect  of 
interval  has  also  been  confirmed  in  two  dimensional  space  by  Falzett  and 
Lappin  (1981). 

It  was,  therefore,  a  somewhat  unexpected  outcome,  in  the  context  of 
the  extreme  sensitivity  to  average  temporal  interval  between  the  dots  just 
mentioned,  to  observe  that  the  mechanism  integrating  dots  into  forms  is 
virtually  insensitive  to  the  temporal  regularity  of  the  sequence  of  dots. 
Stimulus lines with evenly spaced 50 msec intervals are detected only slightly
better  than  lines  with  highly  irregular  intervals.  This  insensitivity  to  temporal 
irregularity  exhibited  in  this  detection  task  is  also  surprising  in  the  context  of 
the  visual  system's  ability  to  detect  brief  gaps  in  a  train  of  otherwise  regular 
flashing  dots  (Uttal  and  Hieronymus,  1970). 
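
One way such a manipulation of temporal regularity might be programmed is sketched below in Python. The function name, the uniform jitter rule, and the seven-dot line are hypothetical; the text does not specify how the irregular intervals were actually generated. The sketch only shows how individual intervals can be made irregular while the mean interval, the variable to which detection is so sensitive, is held fixed.

```python
import random


def dot_onsets(n_dots, mean_interval_ms=50.0, irregularity=0.0, seed=None):
    """Onset times (msec) for a sequence of n_dots dots (n_dots >= 2).

    irregularity is a hypothetical jitter fraction in [0, 1): each gap is
    drawn uniformly from mean * (1 - irregularity) to mean * (1 + irregularity)
    and the gaps are then rescaled so their mean stays at mean_interval_ms.
    """
    rng = random.Random(seed)
    n_gaps = n_dots - 1
    gaps = [
        mean_interval_ms * (1.0 + irregularity * rng.uniform(-1.0, 1.0))
        for _ in range(n_gaps)
    ]
    # Rescale so that the total duration (and hence the mean gap) is unchanged.
    correction = mean_interval_ms * n_gaps / sum(gaps)
    gaps = [g * correction for g in gaps]
    onsets = [0.0]
    for g in gaps:
        onsets.append(onsets[-1] + g)
    return onsets


print(dot_onsets(7))                              # evenly spaced
print(dot_onsets(7, irregularity=0.8, seed=1))    # irregular, same mean
```

Holding the total duration constant in this way separates the effect of interval irregularity from the effect of mean interval, which is the comparison at issue in the paragraph above.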

Even  more  surprising  was  our  subsequent  discovery  that  spatial 
irregularity  also  produced  only  a  minimal  effect  on  detection  scores  for  lines 
of dots plotted at intervals that would be expected to produce apparent
movement.  In  some  manner  the  visual  system  seems  to  smooth  over  both  the 
spatial  position  and  temporal  interval  irregularities  programmed  into  the 
stimulus  lines.  We  can  speculate  that  this  is  accomplished  by  the  same  kind  of 
mechanisms  that  are  well  known  to  account  for  path  smoothing  in  apparent 
motion  itself.  Classic  and  modern  studies  of  apparent  motion  have  indicated 
that  the  apparent  trajectory  tends  to  be  smoothed  in  such  a  way  that  the 
perceived  pathway  is  more  likely  to  reflect  a  good  form  (in  the  Gestalt  sense) 
than  the  actual  spatio-temporal  form  of  the  physical  stimulus.  This 
phenomenon  has  been  formalized  by  Foster  (1978)  into  a  theory  of  apparent 
motion  analogous  to  the  calculus  of  variations  used  in  mechanics.  In  his 
theory  "perceptual  forces"  are  minimized  just  as  are  physical  forces  in  the 
physicist's  calculus  of  variations.  Obviously  there  is  a  considerable  amount  of 
future  research  that  has  to  be  done  to  substantiate  and  understand  this 
surprising  result.  It  seems  to  us  particularly  important  that  the  experiment  be 
repeated  at  other  shorter  inter-target  dot  intervals  to  see  if  the  insensitivity 
to  spatial  irregularity  disappears  at  shorter  intervals  where  the  apparent 
motion  phenomenon  is  no  longer  involved. 

One  of  the  major  hypotheses  initially  motivating  this  study  was  our 
expectation  that  since  spatial  regularity  was  such  a  powerful  determinant  of 
detectability  in  two  dimensional  space,  so,  too,  should  be  spatial  and  temporal 
irregularity  in  three  dimensional  space.  It  was  on  this  basis  that  we 
anticipated  that  any  future  three  or  four  dimensional  mathematical  model  of 
the  processes  we  are  studying  here  that  is  similar  in  concept  to  the  two 
dimensional  autocorrelation  theory  would  be  extremely  demanding  of 
computer  time  for  its  evaluation.  However,  if  this  initial  finding  of 
insensitivity to both spatial and temporal irregularity is generalizable to other
conditions, the model towards which we are working may be considerably
simpler  than  we  had  anticipated. 

In  summary,  our  experiments  describe  a  visual  mechanism  that  has 
some  extraordinary  powers.  The  system  seems  to  be  extremely  sensitive  to 
mean  spatial  and  temporal  intervals.  However,  both  spatial  and  temporal 
interval  irregularity  seem  not  to  influence  the  kind  of  detection  task  we  are 
using  here  when  dot  intervals  are  large  enough  to  create  apparent  motion. 
This  surprising  outcome  may  be  explained  by  the  same  sort  of  mechanisms  that 
account  for  apparent  motion,  a  phenomenon  in  which  discrete  and  intermittent 
stimuli  are  perceived  as  smooth  and  continuous  under  the  control  of 
constructive mental processes whose origins and mechanisms remain almost
totally  unknown. 


Possible  Applications. 


Above  and  beyond  whatever  contribution  our  study  makes  to 
understanding  the  fundamental  nature  of  visual  perception,  we  believe  that  it 
may  also  have  some  useful  and  practical  applications  to  other  display-related 
problems.  These  potential  applications  emerge  both  from  the  technology  that 
we  have  used  to  instrument  these  excursions  in  basic  perceptual  science  and 
from the results we have obtained in these studies. In general, the three
dimensional  display  methodology  offers  a  means  by  which  the  computer  can 
preprocess  and  integrate  multidimensional  visual  information  rather  than 
imposing  this  processing  load  on  the  observer.  The  use  of  the  computer  to 
graphically  display  the  three  dimensions  of  spatial  information  in  this  manner 
makes  for  a  much  more  realistic,  direct,  and  compatible  relationship  than  that 
obtainable  with  a  plan  position  indicator  (PPI)  display.  Not  only  is  the 
relationship  between  the  real  environment  and  the  display  improved,  but  also 
the  directness  of  the  relationship  between  the  observer’s  percept  and  the 
environment.  To  do  otherwise  loads  the  observer  with  an  information 
processing  task  much  better  accomplished  by  the  computer.  The  end  result  of 
using  a  two  dimensional  representation  of  the  three  dimensional  world  is  to 
distract  the  observer's  attention  from  the  tasks  he  can  perform  better  than  the 
computer. 


Perhaps  there  is  no  clearer  instance  of  the  urgent  need  for  a  three 
dimensional  (rather  than  a  two  dimensional)  presentation  of  spatial 
information  than  in  the  volumetric  environment  exemplified  by  either  the  air 
or  undersea  traffic  control  situations.  Merging  polar  coordinate  height  (or 
vertical  depth)  information  and  plan  position  information  from  two  RADAR 
systems is a relatively easy computational exercise. The techniques used in
our  experiments  suggest  that  it  would  be  no  great  technological  feat  to 
present  that  information  stereoscopically.  We  are  convinced  that  there  are  no 
computational or instrumentation difficulties in implementing such a device.
The  question  then  is,  would  such  a  device  provide  enough  advantages  to 
warrant  the  cost  and  effort  of  its  development?  While  a  full  answer  to  this 
question can only be found in the laboratory, it seems clear that the possible
reduction in information processing load required of the observer in these
traffic control situations would be significant.

We should also mention that the particular technique we use here to
construct three dimensional experiences is not unique. Our strategy is only
one possible alternative approach. Others have been offered including Bolt
Beranek  and  Newman's  SpaceGraph  (Sher,  1979;  Huggins  and  Getty,  1981), 
hologram  type  devices,  and  Shetty,  Brodersen,  and  Fox's  (1979)  anaglyphic 
method.  Of  course,  certain  advantages  and  disadvantages  are  characteristic 
of  each  device;  each  may  fill  some  need  better  than  the  others. 
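
To make concrete the claim that merging height and plan position data and presenting the result stereoscopically is computationally simple, the following Python sketch converts one target from radar coordinates to X,Y,Z and then projects it to a left-eye and a right-eye screen position. It is an illustration only, not the hybrid analog method used in our apparatus. The function names, the axis conventions (azimuth clockwise from north, north treated as depth away from the viewer), the 0.065 m eye separation, and the 0.6 m viewing distance are all assumptions.

```python
import math


def radar_to_xyz(ground_range, azimuth_deg, height, scale=1.0):
    """Merge a plan-position fix (range, azimuth) with a height reading.

    Returns (x, y, z) with x to the right (east), y up (height) and
    z away from the viewer (north). `scale` converts world units to
    display units; all of these conventions are assumptions.
    """
    az = math.radians(azimuth_deg)          # clockwise from north
    east = ground_range * math.sin(az)
    north = ground_range * math.cos(az)
    return east * scale, height * scale, north * scale


def stereo_pair(x, y, z, eye_sep=0.065, view_dist=0.6):
    """Perspective-project one point onto the screen plane for each eye.

    The eyes sit at (-eye_sep/2, 0) and (+eye_sep/2, 0), a distance
    view_dist in front of the screen; z is depth behind the screen.
    Returns ((x_left, y), (x_right, y)) in screen coordinates.
    """
    shrink = view_dist / (view_dist + z)
    half = eye_sep / 2.0
    x_left = -half + (x + half) * shrink
    x_right = half + (x - half) * shrink
    return (x_left, y * shrink), (x_right, y * shrink)


# A target 10 km out at azimuth 45 degrees and 3 km up, drawn at 1 cm per km.
x, y, z = radar_to_xyz(10.0, 45.0, 3.0, scale=0.01)
left, right = stereo_pair(x, y, z)
print("left eye: ", left)
print("right eye:", right)
print("horizontal disparity:", right[0] - left[0])
```

Under these assumptions the horizontal disparity reduces to eye_sep * z / (view_dist + z), so a point in the screen plane has zero disparity and points farther behind it are drawn with progressively larger uncrossed disparity.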

As noted, the general advantage that all such three dimensional systems
possess  is  that  they  drastically  reduce  the  information  processing  load  on  the 
observer.  Rather  than  observing  displays  with  altitudes  marked  in  numeric 
codes  near  two  dimensionally  located  targets,  the  observer  would  be 
confronted  with  an  apparent  three  dimensional  display  in  which  accessory 
numeric  information  representing  height  need  not  be  separately  processed. 
Furthermore,  proximity  would  be  much  more  directly  evident,  less  dimensional 
recoding  would  be  required,  and  the  task  would,  therefore,  be  less  stressful, 
less  demanding,  and  require  much  less  operator  training  to  achieve  a  given 
level  of  competence  than  would  the  conventional  two  dimensional  displays  now 
in  use.  We  believe  that  the  advantages  of  such  a  system  would  be  profound.  In 
sum,  these  advantages  include: 

(1)  Improved  observer  reaction  times. 

(2)  Reduced  observer  information  processing  load. 

(3)  Enlarged  traffic  capacity. 

(4)  Reduced  observer  training  requirements. 

(5) Increased conspicuity of hazardous conditions.

Using  modern  computer  graphic  devices,  other  useful  attributes  can 
also  easily  be  designed  into  such  a  display  system.  For  example,  proximity 
could  be  coded  by  color  in  a  way  that  would  very  conspicuously  indicate  the 
imminence  of  dangerous  traffic  conditions.  Trajectory  extrapolation 
information  could  also  easily  be  added  to  the  stereoscopic  display  to  indicate 
where  future  difficulties  may  be  developing.  In  addition,  a  joystick  controlled 
cursor  could  be  added  so  that  individual  targets  could  be  located  in  the  three 
dimensional  space  of  the  stereoscopic  display;  the  three  spatial  coordinates  so 
targeted  could  then  be  displayed  on  digital  readouts.  Alphanumeric 
information  could  be  plotted  on  the  display  as  well.  Figure  17  is  a  two 
dimensional  projection  of  a  hypothetical  three  dimensional  display  presented 
to  give  a  more  graphic  idea  of  the  sort  of  device  we  imagine  would  be  useful  in 
the  traffic  control  situation. 
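
The trajectory extrapolation and proximity coding just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a design for an operational system: straight-line constant-velocity extrapolation, the function names, and the warning and danger separations are all hypothetical.

```python
import math


def extrapolate(pos, vel, t):
    """Straight-line extrapolation of a 3-D position t time units ahead."""
    return tuple(p + v * t for p, v in zip(pos, vel))


def closest_approach(pos_a, vel_a, pos_b, vel_b, horizon):
    """Time (within [0, horizon]) and separation at closest approach."""
    dp = [b - a for a, b in zip(pos_a, pos_b)]    # relative position
    dv = [b - a for a, b in zip(vel_a, vel_b)]    # relative velocity
    speed_sq = sum(c * c for c in dv)
    if speed_sq == 0.0:
        t = 0.0                                   # separation never changes
    else:
        t = -sum(p * v for p, v in zip(dp, dv)) / speed_sq
    t = min(max(t, 0.0), horizon)
    sep = math.dist(extrapolate(pos_a, vel_a, t), extrapolate(pos_b, vel_b, t))
    return t, sep


def proximity_color(sep, warn=10.0, danger=5.0):
    """Map a predicted separation to a display color (thresholds assumed)."""
    if sep < danger:
        return "red"
    if sep < warn:
        return "yellow"
    return "green"


# Two targets at the same height flying directly toward one another.
t, sep = closest_approach((0, 0, 5), (1, 0, 0), (20, 0, 5), (-1, 0, 0), horizon=60.0)
print(t, sep, proximity_color(sep))
```

Because the computation uses all three coordinates, two targets that overlap in plan position but are well separated in height are not flagged, which is the kind of dimensional recoding the display would otherwise leave to the observer.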

It  is  our  expectation  that  such  a  three  dimensional  device  would  be  easy 
to  use,  would  require  less  training,  and  would  provide  higher  margins  of  safety 
in  critical  control  situations.  However,  a  considerable  amount  of  empirical 
research  is  necessary  to  validate  these  presuppositions.  In  particular  it  seems 
necessary  and  appropriate  to  determine  the  limits  of  depth  discrimination  that 
are  achievable  with  this  type  of  display  as  well  as  the  precision  with  which  the 
three  dimensional  cursor  can  be  used  to  determine  the  position  of  objects  in 
the  stereoscopically  generated  space.  Determining  the  conspicuity  of  color 
and the ability of the observer to use three dimensional trajectory
extrapolation are among the many other perceptual experiments that would
have  to  be  carried  out  to  fully  evaluate  the  advantages  and  disadvantages  of 
these  devices. 

A  second  area  of  application  has  been  suggested  to  us  by  T.  Uttal 
(1982).  Three  dimensional  displays  of  the  kind  we  propose  here  would  be  of 
enormous  help,  she  suggests,  in  reducing  the  complex  data  now  obtained  in 
studies  of  atmospheric  physics.  At  present,  inadequate  attention  has  been  paid 
to  the  use  of  dynamic,  three  dimensional  graphic  displays  in  this  important 
area  of  science.  Since  time  plays  a  particularly  important  role  in  this 
application,  the  possibility  of  animated  displays,  perhaps  recalling  previous 
sequences  of  atmospheric  activity,  is  an  exciting  concept.  Stereoscopic 
displays  may  provide  a  way  to  reduce  the  large  scale  computing  requirements 
in  atmospheric  physics  by  substituting  the  powerful  integrative  abilities  of  the 
human  mind  for  the  ponderous  parallel  numerical  calculations  required  in 
complex  fluid  dynamics  problems.  The  idea  of  some  future  meteorologist 
studying a recurrently recalled record of the spatial and temporal history of a
storm is an intriguing one.

A  third  area  of  application  of  our  findings  may  lie  in  two  dimensional 
dynamic  graphics.  We  also  believe  that  the  insensitivity  to  temporal  and 
spatial  irregularity  that  we  have  discovered  for  moving  dots  may  have 
important  implications  for  the  design  of  future  video  displays.  Future  digital 
displays  are  likely  to  exhibit  some  of  these  characteristic  distortions.  Since  it 
seems  likely  that  these  irregularities  may  be  undetectable  in  the  trajectories 
of  moving  objects,  vast  savings  in  engineering  time  and  costs  are  likely  if  we 
determine  the  thresholds  of  visibility  of  those  distortions. 

To  conclude  this  brief  note  on  future  applications,  we  should  note  that 
there  have  already  been  many  developments  comparable  to  the  ones  proposed 
here  outside  of  traffic  control,  atmospheric  physics,  and  video  displays. 
Chemists  routinely  look  at  the  three  dimensional  shapes  of  molecules  and 
neuroanatomists  have  applied  similar  techniques  to  study  brain  structure.  It  is 
somewhat  surprising  that  the  potential  applications  we  have  noted  here  have 
not yet been the targets of similarly intense implementation efforts.


FOOTNOTES


1. This project is supported by Contract #N00014-81-C-0266 from the
Office of Naval Research, Alexandria, Virginia. We are especially
appreciative  of  the  cooperative  support  of  the  science  officer  Dr.  John 
O'Hara. 

2.  We  express  our  appreciation  to  Cheryl  Slay  whose  editorial  and  typing 
skills  made  this  a  far  better  document  than  it  would  otherwise  have 
been. 

3.  Earlier  studies  in  our  laboratory  had  suggested  that  a  large  proportion 
of  possible  observers  were  stereoanomalous.  This  anecdotal  evidence 
was  supported  by  Richards'  (1970)  contention  that  approximately  30%  of 
the  population  may  be  deficient  to  at  least  some  degree.  A  follow-up 
study,  carried  out  in  our  lab  by  Millicent  Newhouse  --  our  laboratory's 
ONR science apprentice -- has shown, however, that only 1 in 100
randomly  sampled  Ss  was  actually  stereodeficient  when  carefully  tested 
with  an  anaglyphic  screening  procedure.  Patterson  and  Fox  (1981)  have 
also  recently  reported  the  same  low  level  of  stereoanomaly  in  the 
general  population. 

4. It is interesting to note that the transformation of the X,Y,Z,t internal
representation in the computer to the XL,YL,t and XR,YR,t
representation on the face of the oscilloscope is the inverse of what the
visual system does when it converts the haploscopic images (XL,YL,t and
XR,YR,t) into an illusion of solidity. In neither case does a solid
actually  exist  in  three  space,  however.  Certainly  in  the  computer  and 
probably  in  the  "mind",  volumes  are  "represented"  in  what  is  best 
described  as  a  symbolic  code. 

5.  To  achieve  this  high  speed  conversion,  we  had  to  modify  the  delivered 
system by removing capacitors C10, C13, C16, C19 from the four digital
to analog converter output stages.


Figure 1. Four sample dotted line stimulus forms in different levels of dotted
visual noise. (A) 3 noise dots/sec; (B) 20 noise dots/sec; (C) 50 noise dots/sec;
(D)  100  noise  dots/sec.  The  noise  dots  and  the  stimulus  form  dots  in  the  actual 
stimulus  display  may  be  distributed  anywhere  within  the  one  second 
presentation  duration.  These  still  photographs  obscure  the  dynamic  quality  of 
the display.

Figure 2. The analog subsystem of the hybrid computer. These components
generate the stereoscopic displays. The OEI units (Mfd. by Optical Electronics
Inc., Tucson, Arizona) are interconnected by a passive network designed by the
manufacturer. This system transforms the digital signals from the Cromemco
System III microcomputer into analog voltages to control the plotting of the
dichoptic images in real time without a prolonged period of digital
computation. (Abbreviations on the OEI modules are designated in the
manufacturer's manual.)

Figure 4. The results of Experiment I for noise densities of 125 dots/sec. The
horizontal  axis  indicates  the  duration  of  each  of  the  equal  intervals  between 
successive  dots.  The  three  curves  are  for  dichoptic  and  left  and  right  eye 
monocular  viewing  respectively.  The  vertical  axis  indicates  the  pooled 
average of all observers' scores for this condition.

Figure 6. The results of Experiment I for noise densities of 250 dots/sec.

Figure 9. The results of Experiment I reanalyzed to display the negligible
effect of track direction (for noise dot densities of 166 dots/sec).

Figure 10. The results of Experiment II in which the temporal intervals
between successive dots of the straight line stimulus were made progressively
more irregular. There is but the slightest effect of interval irregularity, if
any, on detectability. Only the dichoptic condition was run in this experiment.

Figure 11. The result of Experiment III in which the spatial intervals between
successive  dots  were  made  progressively  more  irregular.  These  data  indicate 
virtually  no  sensitivity  to  spatial  irregularity  when  there  is  a  50  msec  interval 
between  dots.  As  discussed  in  the  text,  no  meaningful  units  can  be  attached  to 
the  values  of  the  horizontal  coordinate. 


Figure  12.  A  graphic  depiction  of  the  seven  positions  in  which  the  flashing  dot 
used  in  Experiment  IV  might  be  located  in  any  trial.  The  flashing  dot  stimulus 
was placed in only one of these dot positions in each trial.

Figure 14. The results of Experiment IV plotted as a function of the position
of the flashing dot. The numbers on the horizontal axes are keyed to the
positions shown in Fig. 12.

Figure 15. The results of Experiment IV plotted as a function of the length of
the interval between flashes with data pooled from all noise levels. Only the
dichoptic condition was run in this experiment.

Figure 16. The results of Experiment V in which the interval between flashing
dots was made irregular. Data are plotted for both the dichoptic and
monocular viewing conditions.

Figure 17. A two dimensional projected drawing of a proposed three
dimensional traffic controller display. The apparent cube models the true air
or sea space. Marks indicate the vehicles' current positions. Lines indicate
extrapolated  trajectory.  Such  a  device  would  be  easy  to  build  and  might  have 
substantial  perceptual  advantages  over  two  dimensional  displays  since 
preprocessed  height  and  plan  information  is  integrated  by  the  computer  prior 
to display. The observer's task is thus greatly reduced in complexity.


Part II: Panel Discussion --

Critical  Research  Issues  in 
3-D  Displays 



CRITICAL  RESEARCH  ISSUES  ON  COCKPIT  APPLICATIONS  OF  3-D  DISPLAYS 


Kenneth  R.  Boff,  Ph.D. 

Air  Force  Aerospace  Medical  Research  Laboratory 
Wright-Patterson  Air  Force  Base 


Introduction 

Today's  operational  aircrews  continue  to  experience  workload 
saturation  despite  the  infusion  of  new  display  and  data  handling 
technologies.  Part  of  the  reason  for  this  lies  with  the  overwhelming 
volume  of  displayed  visual  information  which  competes,  at  any  given 
moment,  for  the  pilot's  attention.  Design  decisions  regarding  when  and 
how  (i.e.,  digital,  symbolic  or  pictorial)  this  information  should  be 
portrayed  and  its  spatial  temporal  configuration  can  account  for  a 
significant  measure  of  variance  with  respect  to  the  operator's  ability 
to  acquire  and  process  task  critical  information. 

The  Human  Engineering  Division  of  the  Air  Force  Aerospace  Medical 
Research  Laboratory  (AFAMRL)  is  engaged  in  exploratory  research  to 
support  development  of  a  pilot-centered  cockpit  design  technology. 
This  involves  the  development  of  sound  theoretical  and  empirical  bases 
for  matching  the  perceptual  and  psychomotor  characteristics  of  the 
aircrew  with  the  design  of  controls,  displays  and  approaches  for 
portrayal  of  information  within  the  cockpit. 

Applications of the three-dimensional (3-D) presentation of information
which exploit the human's highly refined and well practiced
sense of depth have been considered for their potential in facilitating
the transfer of information in future aircrew cockpits. One application,
suggested by Furness (1981), involves the use of 3-D in an
integrated  tactical  display  as  a  means  for  a)  providing  the  aircrew  with 
a  spatial  analog  of  objects  and  events  occurring  in  real  3-D  space  and, 
b)  configuring  information  to  reduce  apparent  clutter.  Figure  1  shows 
a  conceptual  representation  of  an  integrated  tactical  display  which 
combines  information  from  a  range  of  different  sources  into  a  single 
pictorial output. In this example, information is presented to the
pilot in a full field hemispherical display. Information may be
accessed along the line of sight within an instantaneous, binocular
field  of  regard  that  can  freely  search  the  total  field  of  view.  This 
display  combines  information  from  instruments,  sensors,  and  data  links 
from  other  airborne  and  ground-based  sources.  It  presents  these  data  in 
a hybrid literal/symbolic package to provide the aircrew with a "pathway
in the sky" presentation complete with threat warning and various
cues to situation awareness. However, before the three-dimensional
presentation  of  information  can  be  seriously  considered  for  future 
implementation  in  integrated  cockpit  displays,  a  number  of  critical 
research  issues  need  resolution. 

Critical  Research  Issues 

A.  EFFECTIVENESS  OF  3-D  AS  A  MEDIUM  FOR  INFORMATION  TRANSFER. 

The  relative  effectiveness  of  3-D  displays  as  compared  with 
encoded  volumetric  information  in  two-dimensional  (2-D)  presentations 
needs to be determined. Over the past thirty years, studies have been
conducted  comparing  various  parameters  of  visual  performance  (e.g., 
target  acquisition,  estimation  of  relative  and  absolute  position  or 
velocity  of  targets,  etc.)  and  various  2-D  encoding  techniques  versus 
3-D presentations (Kennedy and LaForge, 1958; Guttman and Anderson,
1962;  Bassett,  Kahn,  LaMay,  Levy,  and  Page,  1965;  and  others). 

For  the  most  part,  these  studies  did  not  find  evidence  to  support 
a  hypothesis  of  improved  visual  performance  for  various  applications  of 
3-D  displays.  Careful  review  of  these  past  studies  suggests  many 
methodological  difficulties  which  raise  questions  as  to  the  validity  of 
the results. Potential problems were identified, ranging from display
approach and stimulus content and configuration to the prior experience
and training of subjects. Nevertheless, this line of research has
continued  based  in  part  on  the  assumptions  that  a  3-D  presentation 
should  involve  "less  mental  computation"  than  a  coded  2-D  presentation 
(Guttman  and  Anderson,  1962)  and  that  the  "natural  ability"  to 
discriminate the relative spatial orientation and range of objects in
visual  depth  is  not  exploited  by  viewing  2-D  displays  (Leibowitz  and 
Sulzer,  1965;  Abbott,  Higgens,  Strotter,  and  Upton,  1971).  Therefore, 
further  research  is  needed  to  determine  the  specific  conditions  under 
which  3-D  presentations  can  enhance  an  operator's  ability  to  acquire 
and  process  task  critical  information.  Any  observed  increments  or 
decrements  in  performance  need  to  be  evaluated,  in  turn,  with  respect  to 
the  effects  of  depth  cue  conflicts  resulting  from  artifacts  of  the 
display  approach,  requirements  for  depth  cue  redundancy,  and  effects  of 
information  complexity  and  clutter.  Another  area  which  needs  to  be 
empirically  addressed  is  the  information  transfer  effectiveness  of 
depth codes for non-spatial information such as time (e.g., imminence)
or  some  other  assigned  parameter  (e.g.,  lethality  of  threat)  and  the 
extent  to  which  depth  encoded,  non-spatial  information  may  be  used  to 
organize  or  declutter  displays  (Lehmkuhle  and  Fox,  1980).  Addi¬ 
tionally,  the  effects  on  operator  performance  of  co-locating  depth 
encoded  non-spatial  information  and  spatial  analog  information  within 
the same display or instrument console need to be determined.

Research  is  also  required  to  evaluate  the  impact,  if  any,  of  3-D 
presentations  on  the  overall  processing  capability  of  the  system 
operator.  If,  on  one  hand,  a  3-D  presentation  demands  less  processing 
capacity  than  an  information  equivalent  2-D  encoded  presentation,  then 
whether use of 3-D displays enhances the total amount of information
which  the  operator  can  perceive  or  attend  to  at  any  given  moment  needs 
to be determined (Miller, 1956; Leibowitz and Sulzer, 1965). On the
other  hand,  if  the  ability  to  perceive  information  presented  in  3-D  is 
enhanced  in  terms  of  speed  or  reduced  error  rate  without  a  corresponding 
increase  in  total  channel  capacity,  then  3-D  presented  information  of 
low  criticality  could  interfere  with  the  operator's  ability  to  acquire 
or  process  more  critical  items  of  information  (Hill  and  Self,  1961). 

B.  THREE-DIMENSIONAL  DISPLAY  QUALITY  REQUIREMENTS  FOR  EFFICIENT 
OPERATOR  INTERFACE. 

For  future  applications,  it  will  be  necessary  to  establish  psy- 
chophysically  based  criteria  and  specifications  for  the  "packaging"  of 
information  in  3-D  displays.  More  specifically,  the  following  are  of 
concern:  absolute  versus  relative  values  (e.g.,  across  channels  in  a 
two  eye  display)  and  required  system  tolerances  for  luminous  intensity, 
color,  disparity,  distortion,  etc.  with  respect  to  the  operator's 
sensory  limitations,  factors  leading  to  early  fatigue  or  stress,  and 
individual differences among prospective users. Most of the existing
psychophysical  data  germane  to  the  3-D  presentation  of  information  bear 
on  these  issues  (Ogle,  1950;  Farrell  and  Booth,  1975). 

Future  consideration  of  an  operational  flight  display  which 
utilizes  a  3-D  presentation  will  require  resolution  of  these  and  other 
unstated  critical  research  issues.  In  part,  some  of  these  issues  can  be 
addressed  with  relatively  simple  stimuli  and  conventional  method¬ 
ologies.  However,  exploratory  research  in  a  workload  constrained 
environment  is  necessary  to  evaluate  and  validate  the  advantages  of 
3-D.  Investigation  of  these  issues  will  take  place  using  the  AFAMRL 
visually  coupled  airborne  systems  simulation. 




Visually  Coupled  Airborne  Systems  Simulator  (VCASS) 

Since  1966,  the  AFAMRL  has  been  developing  a  new  technology  for 
coupling  pilot/crew  member  visual  input  into  aircraft  systems.  Col¬ 
lectively  termed  "visually-coupled  systems",  this  new  technology  takes 
advantage  of  the  precision  with  which  a  crew  member  can  aim  his  head  and 
direct  his  gaze.  In  essence,  the  interface  between  the  crew  member  and 
the  aircraft  systems  is  brought  about  through  the  communication  of  head 
position  (and,  consequently,  eye  position)  coordinates  in  order  to 
designate  targets,  slew  weapons  or  sensors,  or  activate  switches.  A 
feedback  presentation  of  information  is  also  provided  within  the 
operator's  field  of  vision  regardless  of  head  position.  The  two 
subsystems which comprise visually-coupled systems are the helmet-
mounted sight (HMS), which provides line-of-sight data, and the helmet-
mounted display (HMD), which provides a virtual collimated image
presentation  of  information  (Birt  and  Furness,  1974;  Kocian,  1977; 
Furness,  1980;  Task,  Kocian,  and  Brindle,  1980). 

In  1976,  AFAMRL  began  the  Visually-Coupled  Airborne  Systems 
Simulator (VCASS) program to exploit the advantages of the helmet-
mounted sight/display for visual scene simulation. Figure 2 shows the
conceptual operation of the VCASS. A helmet-mounted display, modified
to provide a wide field-of-view (variable from 100-140 degrees)
binocular  presentation,  provides  an  instantaneous  binocular  visual 
field  of  view  selected  from  an  overall  computer  generated  scene  using 
helmet  position  and  attitude  determined  by  a  six-degree-of-freedom 
helmet-mounted  sight  (Fig.  3).  In  addition  to  the  outside  scene,  head- 
up  display  information  and  synthesized  virtual  cockpit  instruments  can 
be  displayed  at  appropriate  locations  in  space. 
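
As a rough illustration of how an instantaneous field of view might be selected from the overall scene, the sketch below tests whether a scene direction falls inside a window centered on the helmet line of sight. The function name, the azimuth/elevation convention, and the 120 by 60 degree window are assumptions chosen for illustration (the window lies within the ranges quoted in the text); this is not the VCASS implementation, and head roll and head translation are ignored.

```python
def in_instantaneous_fov(obj_az, obj_el, head_az, head_el,
                         h_fov=120.0, v_fov=60.0):
    """Return True if a scene direction lies inside the display window.

    All angles are in degrees. obj_az/obj_el give the direction of a scene
    element; head_az/head_el give the helmet line of sight (hypothetical
    convention: azimuth wraps at 360, elevation is positive upward).
    """
    # Wrap the azimuth difference into [-180, 180) so that directions on
    # either side of the 0/360 boundary compare correctly.
    d_az = (obj_az - head_az + 180.0) % 360.0 - 180.0
    d_el = obj_el - head_el
    return abs(d_az) <= h_fov / 2.0 and abs(d_el) <= v_fov / 2.0
```

Under these assumptions, an element 90 degrees to the right is outside the window with the head pointing forward, and comes into view once the head is turned toward it.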

The  VCASS  display  is  a  complete  two  eye  system  with  separate 
cathode  ray  tubes  (CRTs)  feeding  information  to  each  monocular.  The 
monoculars  are  overlapped  in  angular  space,  permitting  a  continuous 
presentation to both eyes and allowing 3-D information to be presented to
the observer in the overlapped portion of the display. Table 1 provides detailed
specifications  of  the  Laboratory  VCASS. 
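
The relation between binocular overlap and total horizontal field of view can be sketched with simple geometry. The example below assumes two identical monocular fields placed side by side with a shared central region; the 80-degree monocular field is an inferred, illustrative value (it is not stated in the text) that happens to reproduce the horizontal field and overlap ranges listed in Table 1.

```python
def total_horizontal_fov(monocular_fov_deg, overlap_deg):
    """Total horizontal field covered by two overlapping monocular fields.

    Assumes two identical monocular fields sharing a central overlap region.
    """
    if not 0 <= overlap_deg <= monocular_fov_deg:
        raise ValueError("overlap must lie between 0 and the monocular field")
    # Each monocular contributes its full field; the shared region is
    # counted once rather than twice.
    return 2 * monocular_fov_deg - overlap_deg
```

With the assumed 80-degree monoculars, widening the total field from 100 to 140 degrees reduces the region in which stereo information is available from 60 to 20 degrees, which is the trade-off implied by the table.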

The  VCASS  display  provides  a  vehicle  for  exploratory  investigation 
of  critical  human  factors  issues  in  the  presentation  and  utilization  of 
information  preserved  in  3-D  space.  Use  of  the  helmet-mounted  display 
in synchrony with a simulator cockpit affords a unique capability for
testing  and  evaluating  integrated  3-D  displays,  such  as  that  illus¬ 
trated  in  Figure  1,  in  a  flight-task  loaded  environment. 
Figure 2. Conceptual operation of the Visually-Coupled
Airborne Systems Simulator (VCASS).


Figure 3. VCASS helmet-mounted sight and display system.
TABLE 1

Laboratory VCASS Performance

Head position sensor
  Attitude/position sensing      6 DOF
  Allowable head movement        6.25 cubic feet
  Accuracy                       0.2° CEP
  Angular resolution             0.03°
  Update rate                    100 Hz

Optical
  Optical design                 binocular/color corrected
                                 (infinity collimated)
  Field-of-view
    horizontal                   100-140 degrees
    vertical                     60 degrees
    overlap                      20-60 degrees
  Exit pupil                     15 mm
  Transmission (CRT to eye)      0.8 current
               (ambient to eye)  7.0 current
  Optical transfer function      60 LP/mm @ 94% modulation (on axis)
  Distortion                     .002

A STEREO-RANGEFINDER EXPERIENCE


George  S.  Harker,  Ph.D. 
Performance  Research  Laboratory 
University  of  Louisville 


I  would  like  to  review  for  you  work  with  which  I  was  associated  in 
the  1950s  at  Fort  Knox,  Ky.  At  the  time,  the  Army  was  concerned  to  in¬ 
corporate  a  stereo-rangefinder  in  the  fire  control  system  of  the  tank 
(1). The seemingly straightforward design task was almost completely
dominated  by  tactical  and  engineering  considerations.  In  the  end, 
stereo  was  abandoned  for  a  coincidence  task.  It  would  be  nice  to  con¬ 
clude  that  recognition  of  human  factors  early  on  would  have  saved  the 
project.  However,  experience  with  stereo-displays  has  indicated  that 
stereo-ranging  is  inappropriate  against  ground  targets. 

The  activity  in  which  I  took  part  sought  to  answer  the  question: 

Can  armored  personnel  of  class  A  physical  profile  operate  a  stereo¬ 
rangefinder  against  ground  targets  to  an  expectation  of  80%  first  round 
hits? To speak to the question, range readings were taken with
one-meter-base Navy stereo-rangefinders modified to incorporate the Army
"flying  geese"  reticle.  The  optics  of  these  instruments  were  uncompli¬ 
cated  and  balanced  before  the  two  eyes.  The  reticle  to  each  eye  was  in¬ 
serted  independently.  An  internal  corrector  in  the  left  eye  system  was 
used  to  bring  the  reticles  into  zero  registry  and  presumably  to  remove 
operator  bias.  The  ranging  wedge  was  in  the  right  eye  system.  The  re¬ 
sultant  asymmetrical  vergence  caused  the  path  of  the  flying  geese  to  be 
diagonal  from  near  left  to  far  right. 

Analysis  of  the  range  readings  concentrated  on  variability  with  the 
expectation  that  localization  error  would  be  handled  by  a  one  time,  zero 
adjustment. The newly designed Army instruments were to have
auto-collimation in the reticle system which would eliminate operator adjustment
of  the  internal  corrector  and  reduce  variability  by  a  factor  of  two. 
Operator  performance  with  the  Navy  instrument  seemed  to  indicate  that 
the 80% criterion could be achieved easily by 90% of the Army population.
However, practice ranging did not show the expected incremental
increase  in  precision.  Rather,  individuals  who  started  out  doing  well 
continued  to  do  well  and  individuals  who  did  poorly  continued  to  do 
poorly.  Occasionally,  for  no  obvious  reason,  an  individual  would  shift 
from  one  group  to  the  other.  It  was  as  though  stereo  ability  was  a 
given and the variability of ranging reflected the individual's attention
to the task. Given these findings, two thousand range settings was
fixed  as  an  arbitrary  requisite  to  qualify  a  range-finder  operator. 

In  another  phase  of  the  effort,  all  available  stereoscopic  vision 
tests, some 30 in number, were assembled with the help of the National
Research  Council  -  Armed  Forces  Vision  Committee.  The  objective  was  to 
identify  an  appropriate  selection  device.  The  rationale  of  the  project 
called  for  choosing  a  device  that  loaded  significantly  on  a  stereo¬ 
factor  to  be  defined  by  factor  analysis  with  rotation  to  simple 
structure.  Some  tests  were  dynamic,  requiring  the  subject  to  stop  a 
cycling  display.  One  such  test  presented  a  line  inclined  in  depth  which 
was  to  be  stopped  as  it  passed  through  vertical.  Other  tests  required 
the  subject  to  make  an  adjustment  to  equidistance  as  with  the  Howard- 
Dolman  or  a  rangefinder.  The  largest  group  of  tests  were  variations 
on  the  familiar  Wheatstone  stereogram.  These  tests  variously  included, 
in  conjunction  with  the  requisite  disparity,  size  cue,  color,  realistic 
field  of  view,  etc.  A  group  of  200  or  more  enlisted  men  was  processed 
through  the  vision  tests. 

The results of the factor analysis were disappointing. A stereo-factor,
if identified, was minimal in its loading on the tests. Rather,
the  data  fell  into  groups  by  the  type  of  judgement  required  of  the  sub¬ 
ject  and  the  mechanics  of  the  test  device.  In  the  absence  of  a  test 
that  could  be  characterized  as  uniquely  "stereoscopic,"  the  rangefinder 
was chosen as the selection device. Thus, selection and training of
stereo operators was to be accomplished concomitantly with instruction
in  the  detail  and  use  of  the  fire  control  system. 

With  delivery  of  a  one  meter  Army  instrument  which  mounted  in  the 
nose  of  the  tank  turret,  the  project  began  to  fall  apart.  The  number 
of men who could operate in stereo dropped precipitously. Auto-collimation
may have been a plus, but provision of alternate sighting systems
to  handle  combat  eventualities  had  multiplied  the  number  of  elements  in 
the optical paths to a point that binocular vision was all but impossible.
Any semblance of balance in the optical paths was gone, to the extent of
a golden tint from a partial mirror which appeared in the left eye
system  only.  A  second  Army  instrument  of  one  and  a  half  meter  base  was 
designed  to  be  mounted  across  the  turret  about  midway  back.  The  latter 
instrument  was  balanced  for  number  of  optical  elements  but  was  asymmet¬ 
rical  for  base  and  though  considerably  simpler  than  the  first  Army 
instrument was still more complex than the Navy instrument. In the
end,  less  than  forty  percent  of  the  Army  population  could  work  in 
stereo  and  there  was  serious  doubt  that  range  readings  to  any  selection 
of  targets  could  be  zeroed  by  a  single  adjustment  for  operator  bias. 

In  working  with  these  instruments,  the  flying  geese  all  too  frequently 
appeared as fence posts superimposed upon the terrain, a dead giveaway
that  the  operator  was  perceiving  at  least  the  reticles  monocularly. 
Ultimately,  all  instruments  were  modified  to  a  full  field,  superimposed 
image  viewed  monocularly.  To  range,  one  eliminated  double  images  of 
the  target  in  the  presence  of  double  images  of  the  rest  of  the  field  of 
view. 

The  point  in  reviewing  this  effort  to  utilize  stereoscopic  vision 
in a seemingly straightforward application is to bring the experience
gained into current time. It should be noted that tactical and engineering
considerations, not human factors, guided the design process
because there was little human factors information. The consequences
for stereo of the various compromises necessary to production of these
instruments  were  not  known. 

As  part  of  a  program  to  study  the  human  factors  questions  evident 
with the rangefinders, e.g., What is the effect on performance of asymmetrical
reticle movement?, What is the relation of rate of range-knob
rotation to seen movement in the stereo reticle?, etc., a laboratory
instrument was constructed. The stereoptometer (2) consisted of two
reflex  sights  each  of  which  delivered  a  reticle  from  a  reflection  plate 
to  one  eye  of  the  operator.  The  operator's  interpupillary  distance  was 
the  base  of  the  instrument.  The  angle  between  the  parallel  beams  from 
the reflex sights measured the range. A circle or dot of light served
as the reticle. To use the instrument, the operator adjusted the vergence
of the reflex sights to place the fused reticle at the distance
of  a  designated  target  in  the  immediate  environment  seen  through  the 
reflection  plates. 

Figure  1A  and  B  present  stereo  acuities  taken  with  the  stereopto¬ 
meter  for  two  groups  of  enlisted  men.  The  target,  a  white  dowel  h  inch 
in  diameter  was  302  cm  distance  from  the  operator.  The  acuities  in 
seconds  of  arc  are  the  standard  deviation  of  twelve  range  settings. 

Most  of  the  acuities  are  below  a  criterion  of  one  minute  of  arc.  One 
U.O.E.  (12  sec.  of  arc)  was  the  performance  desired  of  trained  Army 
range-finder operators. Figure 1B illustrates the change in distribution
of acuities before and after five weeks of training or two thousand
range finder settings. The effect was rather to increase the separation
of the poorest from the better performers. These findings parallel
the experience with the Navy instrument.
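
The acuity measure described above (the standard deviation of twelve range settings, expressed in seconds of arc) can be sketched as follows. The 6.5 cm base stands in for an operator's interpupillary distance, the twelve settings are invented values, and the use of the sample (n - 1) standard deviation is an assumption; none of these figures come from the study.

```python
import math
import statistics

ARCSEC_PER_RADIAN = 180.0 / math.pi * 3600.0


def vergence_arcsec(base_cm, distance_cm):
    """Angle between the two lines of sight converging at distance_cm."""
    return 2.0 * math.atan(base_cm / (2.0 * distance_cm)) * ARCSEC_PER_RADIAN


def stereo_acuity_arcsec(settings_cm, base_cm=6.5):
    """Sample standard deviation of the vergence angles of repeated settings."""
    angles = [vergence_arcsec(base_cm, d) for d in settings_cm]
    return statistics.stdev(angles)


# Twelve hypothetical range settings (cm) scattered around the 302 cm target.
example_settings = [300.5, 303.0, 301.2, 302.8, 304.1, 299.6,
                    302.3, 301.7, 303.5, 300.9, 302.0, 301.4]
example_acuity = stereo_acuity_arcsec(example_settings)
```

With the assumed base, a setting error of 1 cm at 302 cm corresponds to a little under 15 seconds of arc, so scatter of a centimeter or two in the settings already lies near the one U.O.E. criterion quoted above.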

Pilot  studies  demonstrated  that  performance  with  the  stereoptome¬ 
ter  was  insensitive  to  the  engineering  variables  that  had  been  so  deva¬ 
stating  in  the  production  rangefinders.  However,  range  settings  did 
reflect  the  same  limitation  which  frustrated  the  stereo  operator  when 
working  against  ground  targets.  The  presence  of  stimuli  in  the  field 
of view which interfered with free movement in depth of the stereo
reticle distorted the measures obtained, i.e., the integrity of the
reticle was lost when projected on a near background or intermediate
object.  The  Zaroodny  ballistic  sight  (5)  illustrated  this  feature  of 
stereo  displays  as  did  the  study  by  Irvine  C.  Gardner  of  the  National 
Bureau  of  Standards  (4).  Gardner  used  a  stereo  instrument  which  per¬ 
mitted  both  ortho-  and  pseudoscopic  viewing  to  range  on  seventeen  tar¬ 
gets  in  the  Washington  skyline.  The  resultant  mirror  image  of  range 
displacements  documents  two  points:  1)  the  position  in  depth  of  a 
stereo  reticle  is  influenced  by  perceptual  factors,  and  2)  the  resul¬ 
tant  displacement  in  depth  of  a  stereo  reticle  is  unique  to  the  indivi¬ 
dual  target  and  its  surroundings.  With  the  tank  mounted  stereo-range¬ 
finder  this  was  evident  in  gross  inaccuracies  of  determined  range  to 
targets  on  a  forward  slope  or  in  front  of  an  immediate  background. 

Range  readings  were  short  if  the  operator  kept  clearance  between  his 
reticle  and  the  background  or  were  long  if  he  lost  clearance  and  drove 
the  reticle  into  the  background. 

To  sum  up  the  stereo-rangefinder  experience,  the  almost  universal 
utility  of  the  stereoptometer  relative  to  that  of  the  three  production 
rangefinders supports Harker's law - a law akin to Murphy's law - which
states: An observer will see stereo as an inverse function of the number
of surfaces in the binocular paths. A possible explanation for
this became apparent in subsequent research with the stereoptometer
which demonstrated that symmetrical vergence could be diagnostic with
individuals who had trouble seeing stereo but who were otherwise visually
normal. With symmetrical vergence, the reported direction of
reticle movement when left or right rather than in depth is referable
to the use of a specific eye. A change of vergence in the instrument
to frustrate the observer's inappropriate eye use can be sufficient to
elicit a full stereoscopic response or the change will confirm the
initially determined eyedness. When an individual reacts with a stereo
response under these circumstances, it is evident that he initially
suppressed vision in one eye in response to some feature of the presentation.
Thus, spontaneous suppression could account for the failure,
as with the rangefinder, of an instrument or a display. In retrospect,
the problems that beset the stereo-rangefinder suggest that we lack
basic understanding of the stereo processes.

Fig. 1 Measures are standard deviation of twelve settings with
random offset to a target at 302 cm distance. Extreme
values are summed at the right. Five weeks of rangefinder
training separated test and retest.

The potential benefits of 3-D or stereoscopic visual displays have
not  been  fully  appreciated  in  a  number  of  important  application  areas. 
This  may  be  due  to  problems  in  the  way  stereo  systems  have  been  evalu¬ 
ated  with  respect  to  non-stereo  displays. 

Only  a  few  experimental  evaluations  have  shown  performance  advan¬ 
tages  for  stereoscopic  displays,  whereas  most  comparisons  have  shown 
little  or  no  stereo  benefit.  In  some  cases,  performance  with  stereo 
was  worse  than  with  a  non-stereo  system. 

Proper  experimental  evaluation  of  3-D  visual  display  systems 
requires  attention  to  the  following  factors: 

—  Equal  display  quality  in  the  stereo  and  non-stereo  systems. 

—  Performance  measurement  with  tasks  that  realistically  represent 
the  perceptual  complexity  of  the  operational  environment,  and 
the  learning,  time  constraints,  and  error  penalties  present 
in  the  real  world. 

—  Appreciation  of  the  several  side  benefits  obtained  with  stereo¬ 
scopic  visual  displays,  such  as  improved  image  interpretability 
and  wider  field  of  view,  as  well  as  better  system  reliability. 

—  Reconsidering  the  practicality  of  stereoscopic  techniques  in 
applications  where  stereo  has  previously  been  thought  to 
be  of  little  value,  as  in  flight  simulator  displays. 

Much  of  the  research  in  stereoscopic  vision  is  focused  on  how 
the  human  visual  system  works  to  derive  depth  information  from  the 
disparity  between  left  and  right  retinal  images,  and  on  techniques  for 
producing  appropriate  left  and  right  retinal  inputs  to  the  two  eyes. 

One  neglected  part  of  stereo  research  is  the  methodology  for  demonstra¬ 
ting  the  applications  in  which  stereo  can  be  worth  the  extra  cost, 
complexity,  and  in  some  cases,  the  discomfort  relative  to  non-stereo 
systems. 




In  many  comparisons  between  performance  with  stereo  versus  non¬ 
stereo  displays,  the  stereo  system  was  a  poor  quality  experimental 
prototype  set  up  just  for  the  test,  while  the  non-stereo  system  was 
a  high-quality  commercial  display.  In  a  number  of  laboratories,  it 
was  observed  that  researchers  were  working  with  the  left  TV  camera 
connected  to  the  right  eye  display,  and  vice  versa;  this  produced  re¬ 
versed  binocular  depth,  but  the  observers  were  not  able  to  tell  why  the 
display  "never  looked  quite  right"  until  a  visiting  colleague  reversed 
the  camera  cables. 

Many of the comparisons between stereo and monoscopic displays
have  used  stereo  display  techniques  that  introduced  annoying  flicker, 
coarser  vertical  resolution,  reduced  field  of  view,  binocular  mis¬ 
alignment,  uncomfortable  viewing  equipment,  and  other  extraneous  fact¬ 
ors  into  the  experimental  test.  Proper  design  and  construction  of 
stereoscopic  viewing  equipment  requires  strict  attention  to  alignment 
and  congruence  between  the  two  eye  channels;  as  in  the  manufacture  of 
good  quality  binoculars,  close  tolerances  must  be  observed  in  the 
image  acquisition  and  display  systems.  The  eyestrain  and  discomfort 
that  can  result  from  inattention  to  the  special  requirements  of  stereo 
systems  may  be  responsible  for  test  results  wherein  performance  with  a 
3-D  system  is  worse  than  with  a  2-D  system. 

Certain  stereo  applications  could  not  practically  be  evaluated 
in  the  past,  due  to  the  state  of  the  art  in  display  technology.  Now, 
however,  it  is  possible  to  conduct  a  proper  comparison  of  performance 
with  a  stereo  system  that  is  equal  in  visual  comfort  and  resolution  to 
the  non-stereo  system.  This  would  provide  data  on  the  stereo/mono 
factor  alone,  unconfounded  by  ease  of  operator  use  and  all  those  other 
problems  that  have  plagued  stereo  systems  in  the  past. 

In 1964, Kama and DuMars compared performance on a simple peg-in-hole
task using a through-the-wall master-slave remote manipulator with
force  feedback,  viewed  either  with  stereo  TV  or  conventional  2-D  TV. 
There  was  no  significant  difference  between  performance  times  with  3-D 
as  opposed  to  2-D  TV,  although  during  practice  sessions  the  average 
time  with  non-stereo  TV  was  81  seconds  while  3-D  TV  required  63  seconds. 
Observing  that  in  this  test  the  stereo  TV  had  only  half  the  resolution 
of  the  non-stereo  TV,  Chubb  (1964)  used  the  same  subjects  and  apparatus 
to  compare  performance  using  direct  viewing  through  the  hot-lab  window, 
with  either  one  eye  (mono)  or  two  eyes  (stereo).  Performance  times 
for 40 performance trials (presented in blocks of 5 trials mono and
stereo,  balanced  across  subjects)  were  longer  with  monocular  viewing 
than  for  binocular  viewing.  Both  mean  time  and  variance  were  greater 
in  mono,  with  mono  taking  about  20  percent  longer  than  stereo  for 
these  well-practiced  subjects.  Although  the  novelty  of  one-eyed  view¬ 
ing  may  account  for  some  of  the  longer  performance  times,  there  was 
certainly  no  problem  in  unequal  resolution  or  field  of  view  between 
mono  and  stereo. 
Smith, Cole, Merritt, and Pepper (1979) found a similar 20 percent
increase  in  performance  time  for  mono  TV  compared  with  stereo  TV,  using 
the  same  type  of  through-the-wall  manipulator  and  a  variable  peg-in-hole 
task.  Under  both  clear  and  moderately  degraded  visibility  conditions, 
the average performance times for highly practiced subjects were 20 percent
longer with non-stereo TV, even though the stereo TV had only half
the  vertical  resolution  of  the  mono  system.  In  addition,  the  variance 
with  stereo  TV  was  considerably  less  than  with  the  mono  system.  These 
results  were  obtained  using  a  within-subjects  design,  with  each  subject 
trained  to  asymptotic  performance.  The  peg  task  board  was  rotated  and 
elevated  to  a  new  position  for  each  trial;  stereo  was  used  first,  then 
without  changing  board  position,  the  trial  was  repeated  using  mono  TV. 
This  was  done  to  ensure  that  whatever  learning  advantage  occurred  would 
help  in  the  mono  TV  mode.  The  mono-stereo  display  factor  was  signifi¬ 
cant  at  the  0.0025  level,  even  for  this  task  rich  in  2-D  depth  cues. 

A  second  experiment  was  conducted  with  the  apparatus  described 
above,  but  in  this  case  a  between-groups  design  was  used,  with  un¬ 
practiced  subjects  (a  limited  amount  of  familiarity  with  the  manipu¬ 
lator  was  permitted,  but  not  with  the  task  itself).  In  this  test, 
there  was  no  significant  mono-stereo  display  effect,  probably  because 
the  subjects  were  spending  most  of  the  performance  time  (200  to  £00 
percent  longer  than  the  practiced  group)  learning  how  to  do  the  task. 
The  variability  among  individual  performances  in  approaching  this  task 
makes the  between-groups  design  unsuitable  for  detecting  the  display 
effect — the  mono-stereo  factor  was  submerged  in  the  noise. 
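
The contrast between the two designs can be illustrated numerically. In the sketch below, the six completion times are invented, the mono condition is simply set 20 percent longer than stereo for every subject, and the same numbers are then analyzed twice: once as paired (within-subjects) data and once as if the two columns came from independent groups. The function names are hypothetical and the example is only a schematic of the statistical point, not a reanalysis of the experiments.

```python
import math
import statistics


def paired_t(a, b):
    """t statistic for paired observations (within-subjects design)."""
    diffs = [y - x for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


def welch_t(a, b):
    """t statistic treating a and b as independent groups (between-groups)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / se


# Hypothetical completion times (seconds); subjects differ widely in speed.
stereo_times = [40.0, 60.0, 80.0, 100.0, 120.0, 140.0]
# Assumed display effect: mono takes 20 percent longer for every subject.
mono_times = [1.2 * t for t in stereo_times]

t_within = paired_t(stereo_times, mono_times)
t_between = welch_t(stereo_times, mono_times)
```

Because the paired analysis removes each subject's own baseline, the same 20 percent effect yields a large t statistic within subjects and a small one when the spread among individuals is left in the error term.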

A different task, however, showed highly significant mono-stereo
TV effects even though a between-groups design was used. This task,
unlike the peg-task described above, was not rich in monocular cues
to  depth  and  shape.  It  represented  a  realistic  undersea  situation 
wherein  task  objects  are  often  obscured  by  marine  growth  and  sediment, 
and  the  usual  cues  of  shadow,  size,  interposition,  and  perspective  may 
be  severely  limited.  The  average  performance  times  were  40  to  75  per¬ 
cent  longer  with  2-D  TV  than  with  3-D  TV,  despite  the  poorer  vertical 
resolution  and  annoying  flicker  of  the  3-D  system  that  was  then  in  use. 
The  number  of  errors  was  100  to  170  percent  higher  with  non-stereo  TV. 

These three experiments in 1979, and the two in 1964, are presented
as examples of how the methodology used to test the advantages of
stereoscopic displays versus non-stereoscopic displays can produce
either  a  highly  significant  stereo  effect,  or  no  significant  effect 
at  all. 

It  would  seem  likely  that  as  stereo  display  evaluations  are  con¬ 
ducted  with  new  technologies  now  becoming  available  (e.g.,  solid  state 
cameras  and  displays,  automated  stereo  alignment  and  image  matching), 
and  experimental  evaluations  are  conducted  with  operationally  realistic 
tasks and appropriate test procedures, there will be increased utilization
of 3-D displays for data analysis and for remotely manned systems.
Extra Benefits from Stereo Displays

In  addition  to  the  improved  spatial  perception  and  visual-motor 
coordination  obtained  with  stereoscopic  displays,  there  are  a  number 
of  important  side-benefits  that  are  often  overlooked  in  considering 
use  of  a  3-D  system. 

Stereo  display  systems  are  usually  thought  of  primarily  as  aids 
to  seeing  where  things  are  in  3-dimensional  space.  Stereo  also  provides 
a  tremendous  advantage  in  seeing  what  things  are  in  an  unfamiliar  scene 
or unexpected arrangement of familiar objects (e.g., a salvage situation).

Stereopsis  derived  from  binocular  retinal  disparity  provides  an 
unambiguous and primary visual separation of figure and ground without,
paradoxically, having to see an object before separating it from the
background.  As  many  photointerpreters  have  found,  stereo  is  often 
essential  for  rapid  and  accurate  initial  perception  of  objects  in 
the  rcene,  especially  when  the  imagery  has  low  resolution,  poor  con¬ 
trast,  or  noise  that  camouflage s  the  signal.  In  fact,  the  poorer  the 
image  quality  (typical  of  LLLTV  or  FLIR  imagery)  the  more  stereo  can 
help  in  initial  target  acquisition  and  identification;  this  is  because 
the  target  image  signal  is  correlated  in  the  left  and  right  images, 
while  the  noise  (assuming  independent  channels)  is  not.  It  is  a  pro¬ 
perty  of  the  binocular  stereopsis  system  that,  it  can  reject  uncorrela¬ 
ted  noise  while  retaining  those  image  points  that  are  correlated  in 
a  depth  plane  reasonably  close  to  the  fixation  plane. 
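
The statistical point in this paragraph can be illustrated with a toy
simulation. The sketch below is only an illustration under stated
assumptions, not a model of the visual system: the target signal is
shared by both channels, each channel adds its own independent Gaussian
noise, and the function names, sample count, and noise level are
hypothetical choices made for the example.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate_two_channels(n=5000, noise_sd=2.0, seed=0):
    """Return (image correlation, noise correlation) for a simulated pair.

    Assumption: both channels carry the same unit-variance signal, and each
    adds independent zero-mean Gaussian noise with standard deviation
    noise_sd (here twice the signal's standard deviation).
    """
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise_left = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    noise_right = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    left = [s + e for s, e in zip(signal, noise_left)]
    right = [s + e for s, e in zip(signal, noise_right)]
    return pearson(left, right), pearson(noise_left, noise_right)


if __name__ == "__main__":
    image_corr, noise_corr = simulate_two_channels()
    print("left/right image correlation:", round(image_corr, 3))
    print("left/right noise correlation:", round(noise_corr, 3))
```

Even with noise twice as strong as the signal, the simulated left and
right images stay measurably correlated, while the two noise fields
taken alone show a correlation near zero. That difference is what a
correlation-based process can exploit.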

The limited resolution and gray scale typical of current systems
to aid pilot vision in low levels of illumination or in poor visibility
may benefit greatly from stereo display techniques, especially for
nap-of-the-earth flight and low-level target acquisition. Stereo can
help sort out the masses of poorly resolved terrain and foliage that
are jumbled together in a conventional 2-D display (particularly
because the limited gray scale gives little information from
interposition cues).

Improved resolution versus field-of-view is contributed by the
extra information in two image channels versus one in a non-stereo
system. Just as a person can read an eye-chart better with two eyes
than with one, the effective visual performance with a stereo TV
display could be expected to exceed that for a comparable mono TV system
by about 40 percent. This would permit either much better resolution
with the same field of view, or a bigger field of view with the same
resolution, as compared to a mono system comprising the same video
hardware in a single channel. This means that the current state of
video hardware can be purchased in duplicate to achieve significantly
better seeing for the human operator. In addition, two independent
channels, like two engines on an aircraft, provide a contingency mode
in case one of the camera/display channels should fail.
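
The figure of about 40 percent is not derived in the text. One plausible
reading, offered here only as an assumption, is the square-root-of-two
improvement in signal-to-noise ratio obtained when two samples with
independent noise are averaged. The sketch below checks that ratio
numerically; the function name and parameters are hypothetical.

```python
import random


def noise_sd_ratio(n=20000, noise_sd=1.0, seed=1):
    """Ratio of single-channel noise SD to the noise SD after averaging
    two channels whose noise is independent (an assumption of this sketch).
    """
    rng = random.Random(seed)
    a = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    b = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    averaged = [(x + y) / 2.0 for x, y in zip(a, b)]

    def sd(values):
        mean = sum(values) / len(values)
        return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

    return sd(a) / sd(averaged)


if __name__ == "__main__":
    # Theory for independent noise: sqrt(2), i.e. roughly a 41 percent gain.
    print(round(noise_sd_ratio(), 2))
```

Whether human observers actually realize this full statistical gain with
a given display is an empirical question that the sketch does not
address.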
New Applications for Stereo

One  of  the  reasons  for  not  properly  evaluating  stereo  in  certain 
applications  is  the  belief  that  it  would  have  no  relevance,  and  thus 
it  is  never  tried. 

It  is  often  said  that  3-D  displays  are  not  needed  in  flight 
simulators,  since  binocular  stereopsis  in  normal  human  vision  is 
not  a  strong  cue  beyond  several  hundred  feet  in  visual  range. 

Certain  flying  tasks  are  now  becoming  increasingly  common  where  the 
visual  distances  involved  fall  well  within  human  stereo  thresholds. 
Nap-of-the-earth  flight,  low-level  attack  missions,  VSTOL  take-off 
and  landing,  and  other  flight  operations  rely  on  binocular  cues  in  the 
real  world.  By  providing  stereo  cues  in  the  flight  simulator  display, 
trainees  can  begin  to  learn  these  cues  just  as  they  will  eventually 
in  the  real  world.  Other  examples  of  close  visual  distances  in  flight 
are  in-flight  refueling  and  formation  flying,  where  depth  differences 
as  little  as  6  inches  are  resolved  at  a  distance  of  30  feet. 
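
The size of the binocular cue in the refueling example can be estimated
from viewing geometry. The sketch below is a rough calculation, not a
figure from the original work: it assumes an interocular distance of
2.5 inches (a typical adult value chosen for illustration) and computes
the difference in vergence angle between two points straight ahead at
30 feet and at 30 feet 6 inches.

```python
import math

ARCSEC_PER_RADIAN = 3600.0 * 180.0 / math.pi


def relative_disparity_arcsec(eye_sep, distance, depth_diff):
    """Binocular disparity, in arc seconds, between a point at `distance`
    and a point at `distance + depth_diff`, both straight ahead.

    All three lengths must be in the same unit. The disparity is the
    difference between the vergence angles subtended at the two eyes.
    """
    near = 2.0 * math.atan(eye_sep / (2.0 * distance))
    far = 2.0 * math.atan(eye_sep / (2.0 * (distance + depth_diff)))
    return (near - far) * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    # Assumed 2.5 in eye separation; 30 ft = 360 in; 6 in depth difference.
    print(round(relative_disparity_arcsec(2.5, 360.0, 6.0), 1))
```

Under these assumptions the disparity comes out at a little over 20 arc
seconds. Because disparity falls off roughly with the square of the
viewing distance, the same 6-inch difference at several hundred feet
would produce a far smaller angle, which is consistent with the common
remark that stereopsis is a weak cue at long range.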

New  types  of  operational  requirements  and  new  types  of  video 
display  hardware  suggest  a  re-examination  of  those  areas  where  stereo 
benefits  may  be  worth  the  extra  cost  and  inconvenience  of  a  dual 
channel  display. 

Whatever the application and the hardware selected for initial
evaluation,  the  methodology  for  comparing  3-D  versus  2-D  systems  is 
extremely  critical  for  a  proper  assessment  of  the  costs  and  benefits. 


REFERENCES 

Chubb, G.P. (1964). A Comparison of Performance in Operating the
CRL-8 Master Slave Manipulator under Monocular and Binocular Viewing
Conditions.  Aerospace  Medical  Research  Laboratories,  Wright-Patterson 
Air  Force  Base,  Ohio.  MRLD-TDR-64-68  (AD  608791). 

Kama,  W.N.  &  DuMars,  R.C.  (1964).  Remote  Viewing:  A  Comparison  of 
Direct  Viewing,  2-D  and  3-D  Television.  Report  AMRL-TDR-64-15,  6570th 
Aerospace  Medical  Research  Laboratories,  Wright-Patterson  Air  Force 
Base,  Ohio. 

Smith, D.C., Cole, R.E., Merritt, J.O., Pepper, R.L. (1979). Remote
Operator Performance Comparing Mono and Stereo TV Displays: The Effects
of Visibility, Learning and Task Factors. Naval Ocean Systems Center,
San Diego, California. NOSC TR 380.




VISUAL  PERCEPTION  RESEARCH 
AT  NAVAL  OCEAN  SYSTEMS  CENTER 


Ross  L.  Pepper 
Box  997 

Kailua,  Hawaii  96795 


As many of you know, the Naval Ocean Systems Center has played a
leading role in the development of a number of undersea vehicles and
work systems, both manned and unmanned. For example, the family of
CURV vehicles, CURV I, CURV II, CURV III, and RUWS, the remote
underwater work system, their contemporaries, AUWS and the advanced
tethered vehicle that we are working on today, are all products of
NOSC's exploratory development efforts.

I  have  to  give  credit  to  Dr.  Robert  Cole  of  the  University  of 
Hawaii,  who  has  been  my  constant  colleague  since  I  became  involved  in 
vision  research.  Our  early  work  included  John  Merritt  of  HFR  and  David 
Smith,  an  engineer  at  NOSC. 

When we initiated research in 3D displays at NOSC to support the
undersea vehicle program, I felt that we should employ the best visual
systems that were available. Initially, I encountered a lot of
resistance to the idea of stereo television, even at NOSC. The prevailing
attitude was that it had been tried but it doesn't work. It causes eye
strain. It's too complicated and it's too costly, so we don't want it!
I began to survey the literature to try to verify some of these claims.
The literature indicated that there was no significant performance
advantage to stereo displays compared to conventional TV displays, and
in some cases the stereo systems were found to produce results that were
poorer than the conventional systems. I found this hard to believe.
After all, the findings in perceptual research under direct experience
conditions consistently show a tremendous advantage to binocular
vision. With appropriate controls to eliminate motion parallax cues,
stereo acuity thresholds are nearly an order of magnitude smaller than
mono acuity thresholds when tested in an apparatus like the
Howard-Dolman situation.

The results of our early work suggest that manipulator/operator
performance under simulated undersea work conditions is determined by a
complex interaction of several important factors. These factors are the
visual information available to the operator, including the visibility
conditions and the sensitivity of the display-sensor system; the
manipulator capability; the task requirements imposed upon the operator;
and the operator's capacities themselves, that is, his experience, his
learning abilities, and his motivation to perform.

Under  controlled  conditions  with  tasks  which  required  the  operator 
to  position  the  manipulator  end-effector  in  the  Z  axis  or  depth  plane, 
performance  was  always  superior  when  stereo  systems  were  compared  to 
conventional  TV  systems.  The  performance  advantage  with  stereo  was  even 
greater  under  degraded  visibility  conditions. 

Our first studies were valuable to me for a number of reasons.
First, at least in my own mind, I unravelled the inconsistency in the
literature regarding the meager support for stereo versus conventional
TV displays. The variables that I found to account for the discrepancy
were poorly conceived experiments, inferior stereo systems (which
exist even today), and little or no control over learning and practice
effects on the part of the operators. Second, I acquired an
appreciation for the immense human factors engineering and man-machine
interface problems that exist in employing stereo displays, especially
when we ultimately seek to extend this sensory capacity to the operator.
This appreciation led my colleagues and me to develop a systematic
approach to the analysis of the necessary and sufficient display
conditions responsible for the various levels of operator or
teleoperator performance. We are currently employing this display
performance transform method to evaluate a variety of display features
which are state-of-the-art or which show promise to extend man's
capabilities.

The human factors complexities involved in teleoperator displays
became apparent during the course of our research. We discovered an
interesting illusory movement that occurs when one's head is translated
from side to side in a horizontal plane while viewing a stereo TV
display. The apparent motion of the stereo targets which results from
lateral head movements is like true motion parallax in that the
movements are proportional to the targets' distance from the
convergence plane, but they are in the opposite direction of true
motion parallax. This illusory motion is thought to be the result of a
central compensation mechanism which compares head movements with
retinal image movements in order to maintain a stabilized image of the
environmental objects.
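
The direction and proportionality of this apparent motion can be
reproduced with simple ray geometry. The sketch below is a purely
geometric illustration and an assumption of this example, not the
central compensation account given above: it treats the perceived
location of a stereo target as the intersection of the two lines of
sight through fixed left and right image points on the screen, and
asks how that intersection moves when the head translates sideways.
All names and numbers are hypothetical.

```python
def perceived_point(x_left, x_right, head_x, eye_sep, screen_dist):
    """Perceived (x, z) of a stereo target under a ray-intersection model.

    The eyes are at (head_x - eye_sep/2, 0) and (head_x + eye_sep/2, 0);
    the screen is the plane z = screen_dist. x_left and x_right are the
    fixed screen positions of the left-eye and right-eye images. The
    screen disparity must be smaller than eye_sep.
    """
    disparity = x_right - x_left  # positive (uncrossed): behind the screen
    if disparity >= eye_sep:
        raise ValueError("lines of sight do not converge in front of the viewer")
    z = screen_dist * eye_sep / (eye_sep - disparity)
    mid = (x_left + x_right) / 2.0
    x = head_x + (mid - head_x) * z / screen_dist
    return x, z


def lateral_shift(x_left, x_right, head_shift, eye_sep=6.5, screen_dist=100.0):
    """Sideways movement of the perceived point when the head moves by head_shift."""
    x0, _ = perceived_point(x_left, x_right, 0.0, eye_sep, screen_dist)
    x1, _ = perceived_point(x_left, x_right, head_shift, eye_sep, screen_dist)
    return x1 - x0


if __name__ == "__main__":
    print(lateral_shift(-1.0, 1.0, 5.0))   # target behind the screen plane
    print(lateral_shift(0.0, 0.0, 5.0))    # target in the screen plane
    print(lateral_shift(1.0, -1.0, 5.0))   # target in front of the screen plane
```

In this model the shift equals the head movement times (1 - z/screen_dist),
so it grows with the target's distance from the screen plane. A target
perceived behind the screen moves against the head and one in front moves
with it, which is the reverse of what a fixated real scene produces.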

Regardless of its illusory nature, it seemed reasonable to expect
that the relationship which holds between the apparent distance of
objects, the convergence angle of the cameras, and the degree of what we
term the "pseudo-parallax" of these objects, could be used by the visual
system in much the same way that true motion parallax is used. Results
of our initial study of this phenomenon indicated that the
"pseudo-parallax" cues did not improve the performance associated with
the use of stereo cues alone. While this result cast some doubt on the
usefulness of head movements in conjunction with stereo displays in
which camera positions are fixed during the given task, it does not
detract from the idea that an isomorphic head-coupled, camera-aiming
system could produce substantial benefits in teleoperator performance.
Such a system would not only produce true motion parallax cues to depth
but would also allow the operator to visually search the remote work
site in a manner analogous to direct experience.

The second variable of interest in these studies is lateral camera
separation, which results in a magnification of the retinal disparity
cue to depth. In general we found that with TV displays, the most
substantial gain in stereo acuity comes in the transition from mono to
stereo viewing conditions. In this earlier study, this two-fold
increase in performance occurred with camera separations set at
approximately half the interocular distance of the human eye. With
camera separation increased to normal interocular distances, then
beyond into the region of hyperstereopsis, we observed a gradual but
diminishing increase in stereo acuity to a level approximately that
found under direct viewing conditions. Thus, by enhancing the retinal
disparity cues to depth through increasing camera separation,
teleoperator performance can be substantially improved.
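
The way camera separation magnifies the disparity cue can be sketched
geometrically. The example below is an idealized calculation, not data
from these studies: it assumes the disparity delivered to the observer
is simply the difference in vergence angle at the two cameras, ignores
display size and magnification, and uses a hypothetical 6.5 cm reference
interocular distance, 200 cm target distance, and 1 cm depth interval.

```python
import math

ARCSEC_PER_RADIAN = 3600.0 * 180.0 / math.pi


def camera_disparity_arcsec(baseline_cm, distance_cm, depth_diff_cm):
    """Angular disparity at the cameras between two targets straight ahead,
    one at distance_cm and one depth_diff_cm farther away."""
    near = 2.0 * math.atan(baseline_cm / (2.0 * distance_cm))
    far = 2.0 * math.atan(baseline_cm / (2.0 * (distance_cm + depth_diff_cm)))
    return (near - far) * ARCSEC_PER_RADIAN


def disparity_gain(baseline_cm, reference_cm=6.5,
                   distance_cm=200.0, depth_diff_cm=1.0):
    """Disparity at baseline_cm relative to the assumed reference separation."""
    return (camera_disparity_arcsec(baseline_cm, distance_cm, depth_diff_cm)
            / camera_disparity_arcsec(reference_cm, distance_cm, depth_diff_cm))


if __name__ == "__main__":
    for baseline in (3.25, 6.5, 38.0):
        print(baseline, "cm ->", round(disparity_gain(baseline), 2), "x")
```

In this idealization the disparity grows almost in proportion to the
camera separation. The diminishing gains in measured stereo acuity
described above are an empirical result that the geometry alone does
not predict.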

In our most recent work, we elected to obtain a pure measure of
hyperstereopsis by eliminating the cue conflict inherent in our previous
study. A new stimulus presentation apparatus was constructed in a room
which could be totally darkened. The apparatus consisted of two
parallel guide ways from which light sources are suspended. This enables
us to present luminous two-dimensional targets along the observer's
Z-axis plane. We additionally built a camera station which is easily
moveable with respect to distance from the targets. This enabled us to
examine the joint effects of camera separation and distance on stereo
acuity.

The results of this study paralleled those of the initial effort using
the Howard-Dolman apparatus. For all three conditions employed there was
a substantial gain in performance associated with the transition from
mono to stereo viewing conditions. Further increases in camera
separation led to gradual but diminishing increases in stereo acuity.
At the largest camera separation tested, 38 cm, performance was similar
to that observed under direct view. It is important to bear in mind
that while hyperstereopsis is successful in promoting stereo acuity in
this very simple perceptual judgment task, its effects under more
visually complex perceptual and perceptual motor tasks still require
study. We cannot simply assume that these variables will be as
effective in more complicated stimulus situations. Our approach to
obtaining this kind of knowledge consists of carefully designed and
carefully controlled studies.

While we continue to be occupied with this direction of research, my
engineering colleagues at NOSC are making strides in developing the
hardware systems for a future generation of general purpose
teleoperators. Presently, an isomorphic head-coupled camera-aiming
teleoperator system is near completion. It has a flexible spine
with a pan and tilt mechanism. The head employs two small Panasonic CCD
cameras which permit a close interocular distance, approximating that of
the human eye. The teleoperator also has a stereo hearing system with
microphones located in the environment. The latest of this type of
teleoperator system will be made available with two sets of pan and tilt
units, one on the lower back and one at the juncture of the shoulder.
When properly controlled with the computer, this new teleoperator will
enable us to employ isomorphic movement to determine the contribution of
head motion parallax in various perceptual judgment and perceptual motor
tasks.

I think it is important to recognize that the complexity of these
systems, which depends upon the number of variables that you want to
build in, needs to be carefully evaluated. They may place additional demands
on  the  operator  which  may  or  may  not  be  offset  by  the  value  of  the  cues 
that  are  available.  It  is  only  by  measuring  performance  that  you  can 
determine  whether  this  value  is  worth  the  cost  in  maintenance,  the  cost 
in  reliability,  and  the  initial  development  cost  itself.  I  think  these 
trade-offs  can  be  best  assessed  by  the  systematic  gathering  of  data  in 
the  way  that  we  are  proceeding.  Thank  you. 


Part  III:  Panel  Discussion  -- 

Applicability  of  3-D  Display 
Research  to  Military  Operational  Needs 
The Naval Air Development Center is responsible for research and
development in support of Naval/Marine Corps aviation. This
responsibility includes the research and development associated with
control/display technologies and systems for a wide variety of
fixed-wing high performance, fixed-wing low performance and rotary wing
mission applications. Inherent in this research and development is, in
general, an advocacy for appropriate stereo displays. Basic vision
research and human factors experiments are focusing on stereo display
phenomena and stereo applications. The display technology and hardware
system development is being done anticipating the need for stereo,
two-eye presentations. This may involve the need for color (depending on
the stereo display technique chosen) or the need for modular system
configurations to handle single eye vs. two-eye presentation
requirements.

In the case of advanced display technology development, the Navy's
work in this area is coordinated with the Air Force, Army, and NASA via
several Tri-Service working groups. This interaction has further
served to coordinate portions of the control/display development for
the airborne community with technology development for the ships, land
based vehicles and man-portable systems efforts as well. In addressing
the topic of military operational requirements for 3D displays,
these interactions aided in compiling the listing shown in Figure 1.
This listing, while not meant to be at all comprehensive, represents
areas where one or more of the Services have been involved in applied
research and development associated with stereo displays over the past
ten to fifteen years.

The application areas represented in Figure 1 are extremely
varied. Much of the early R&D done by the military probably had as
its operational objective remote manipulation and ordnance disposal.
Many of the applications have dealt with the use of a stereo display
presentation as a vehicle control or pilotage aid. These vehicle
control applications have encompassed rotary wing, as well as fixed
wing manned vehicle flight control, remotely-piloted vehicles (both
air and ground based), major efforts in undersea vehicle movement,
rescue and manipulation, and upgrading biocular to binocular display

•  REMOTE MANIPULATION
•  BOMB DISPOSAL
•  VEHICLE CONTROL/PILOTAGE
   -  ROTARY WING
   -  FIXED WING HIGH PERFORMANCE
   -  REMOTELY PILOTED VEHICLES (AIR & GROUND BASED)
   -  UNDERSEA
   -  COMBAT VEHICLES
•  RECONNAISSANCE/TARGET ACQUISITION
   -  DOWNWARD LOOKING
   -  FORWARD LOOKING
   -  VISUALLY COUPLED
•  AIR-TO-AIR REFUELING
•  FIRE CONTROL/WEAPON DELIVERY
•  COMPUTER GENERATED INFORMATION/IMAGERY FOR
   -  MANEUVERING FLIGHT PATH GUIDANCE
   -  VISUAL SCENE SIMULATION

Figure 1. Military Applications for Three Dimensional Displays


presentations  for  combat  vehicle  control.  Separate  and  distinct  from 
vehicle  control  have  been  investigations  in  reconnaissance  or  target 
acquisition.  Efforts  in  both  downward-looking  imagery  and  forward- 
looking  stereo  sensors  have  been  accomplished.  In  these  areas  and  in 
some  of  the  vehicle  control  areas,  some  investigations  have  dealt  with 
accentuated  stereo  display  presentations,  and  some  work  has  been  done 
with  visually-coupled  stereo  presentations  using  head  tracker  and 
helmet-mounted  display  technologies.  Stereo  has  been  investigated  as  a 
display  aid  for  the  final  phase  in  air-to-air  refueling  missions.  In 
the  fire  control  and  weapon  delivery  area,  various  DoD  laboratories  have 
looked  at  the  stereoscopic  presentation  of  fire  control  symbology  as  a 
performance  enhancement  aid  in  weapon  delivery.  Finally,  in  the  area  of 
computer  generated  symbology  and/or  imagery,  whether  as  a  flight  control 
aid such as maneuvering flight path guidance in the air, or ground-based
visual  scene  simulation,  biocular  and/or  binocular  display  presentations 
are  involved. 

As mentioned earlier, the Naval Air Development Center is focusing
on the airborne community with its R&D efforts. If there is truly a
requirement to get stereo into the air, that is if a system development
or airframe development program manager needs a stereo display
capability, there are several hardware options available. One such option
is the use of a helmet-mounted display. Shown in Figure 2 is an example
of a class of binocular helmet-mounted displays under development for
high performance aircraft applications.


Figure  2.  Binocular  Helmet-Mounted  Display 
For  High  Performance  Aircraft 

The  binocular  helmet-mounted  display  shown  in  Figure  3  is  typical  of 
rotary  wing  helmet-mounted  display  systems.  Both  of  these  systems  offer 
the  potential  for  providing  two  independent  images  to  the  airborne  crew 
member.  Another  stereo  display  hardware  option  is  the  use  of  a  device 
such  as  the  one  shown  in  Figure  4.  This  is  the  optical  relay  tube  in 
the  new  AH-64  Advanced  Attack  Helicopter.  Similar  devices  have  been 
investigated for multi-crewmember high performance aircraft such as
the F-111. The crewmember puts his face "in the boot" and is presented
virtual  image  display  information.  A  device  like  this  uses  the  "boot" 
to  maintain  exact  head/eye  position,  and  could  therefore  be  used  to 
present  stereo  type  display  information  in  a  "heads-down  mode."  A  third 
display  hardware  option  exists  in  lieu  of  presenting  the  two  scenes  to 
the  operator  through  two  completely  independent  hardware  channels.  This 
option  involves  using  spectral  or  time  coding  of  the  information  on  a 
specially  modified  direct  view  display,  and  issuing  a  set  of  red/green 
glasses  or  PLZT-type  switching  glasses  to  the  operator.  An  example  of 
this  approach  is  shown  in  Figure  5. 


Brindle 


Figure 3. Binocular Helmet-Mounted Display for Rotary Wing Aircraft


Figure  4.  Optical  Relay  Tube  in  the  Advanced  Attack  Helicopter 
Figure 5. 3-D Display Using PLZT Glasses


Although it seems fairly elementary, the requirements implied in a
stereo system should be emphasized. If a system developer needs a
stereo system, then at the front end must be a source of stereo
information, whether it is a pair of high resolution sensors (forward
looking infrared (FLIR), low light level TV (LLLTV), or radar) such as
the example shown in Figure 6, or a stereo set of symbology which the
operator overlays on the real world such as the example shown in Figure 7,
or two computer generated perspectives of a computer generated scene
such as the one shown in Figure 8. Of course for airborne applications
Figure 6. Low Light Level Television Imagery

Figure 8. Computer Generated Imagery


two  onboard  sensors  or  a  single  sensor  with  sophisticated  optics/ 
electronics  to  achieve  a  perspective  view  of  the  world  are  required  to 
supply  real  world  stereo  video.  For  on-board  computer  generated 
symbology  or  imagery,  the  information  obviously  must  be  computed  twice 
to  achieve  the  stereo  or  perspective  information  display.  The  display 
end  of  the  system  has  similar  two  channel  requirements.  The  conceptual 
layout of a helmet-mounted display shown in Figure 9 can be used
generically in discussing stereo display options. The requirement exists
for two generated images, two sets of relay optics, and two final
presentation elements for the operator to experience a stereo display
presentation.
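
The remark that computer generated information must be computed twice
can be made concrete with a small projection example. The sketch below
is a minimal illustration assuming an idealized pinhole projection with
parallel viewing axes; the function names, the baseline, and the focal
length are hypothetical and do not describe any particular symbol
generator.

```python
def project(point, eye_x, focal_length=1.0):
    """Pinhole projection of a 3-D point (x, y, z) for an eye located at
    (eye_x, 0, 0) and looking along +z. Returns image-plane (u, v)."""
    x, y, z = point
    if z <= 0.0:
        raise ValueError("point must be in front of the viewer")
    return (focal_length * (x - eye_x) / z, focal_length * y / z)


def stereo_pair(point, baseline, focal_length=1.0):
    """Project the same scene point once per eye: the 'computed twice' step."""
    half = baseline / 2.0
    left = project(point, -half, focal_length)
    right = project(point, +half, focal_length)
    return left, right


if __name__ == "__main__":
    left, right = stereo_pair((0.0, 0.0, 10.0), baseline=0.065)
    # The horizontal offset between the two images encodes depth:
    # it equals focal_length * baseline / z for this geometry.
    print(left, right, left[0] - right[0])
```

The two projections differ only in the horizontal eye offset, so every
point of the scene has to pass through the projection step once for each
channel. That doubling is the cost referred to above.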

Figure 9. Helmet-Mounted Display Conceptual Layout


At this point the following discussion may appear to be a
digression, but its relevance to the point to be made will become apparent.
Figure 10 shows the Navy's new F-18 cockpit. The Air Force F-16 could
just as appropriately be represented here, since the F-18 crewstation
is representative of a trend in the airborne community across the
Services. The trend is toward the use of cathode-ray tube displays in
the cockpit replacing electro-mechanical instruments. Technology


Figure  10.  Navy  F-18  Cockpit  Configuration 


development efforts are aimed at augmenting the displays shown with
multi-line-rate video compatible head-up and helmet-mounted displays.
So these trends are starting to get electro-optical display capability
(a necessity for stereo) into the cockpit. The displays shown, however,
are multifunction displays and the trend in system architecture is
toward a bus-type architecture, such as the one shown in Figure 11,
which provides the crewmember tremendous capability and flexibility.
With this type of system architecture any information can be put up on

Figure 11. System Architecture Using Parallel Bus Concept


any of the displays in the crew station. If one display, or a symbol
generator goes down, the multi-function-display/bus-architecture
capability allows that information to be presented to the crew member
using other displays and/or symbol generators. Another advantage of the
trend toward multi-function displays is the ability to configure an
aircraft cockpit for multiple missions. In this way, by reconfiguring
the cockpit displays an aircraft can be configured for air-to-air,
air-to-ground, or reconnaissance missions. For the display hardware
developer this means a non-dedicated, non-specialized display with a
standardized interface requirement.
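
The rerouting behavior described above can be sketched in a few lines.
The example below is a hypothetical illustration only: the feed and
display names, the failure set, and the round-robin assignment rule are
all assumptions made for the example and do not describe any fielded
avionics bus.

```python
def route_feeds(feeds, displays, failed=()):
    """Assign each information feed to a working multi-function display.

    feeds    -- feed names in priority order (hypothetical labels)
    displays -- display names available on the bus (hypothetical labels)
    failed   -- displays currently out of service
    Feeds share displays round-robin when there are more feeds than
    working displays, standing in for pilot-selectable pages.
    """
    healthy = [d for d in displays if d not in failed]
    if not healthy:
        raise RuntimeError("no working display available")
    return {feed: healthy[i % len(healthy)] for i, feed in enumerate(feeds)}


if __name__ == "__main__":
    feeds = ["FLIR video", "radar", "moving map"]
    displays = ["left MFD", "center MFD", "right MFD"]
    print(route_feeds(feeds, displays))
    # With the center display down, its feed moves to a surviving display.
    print(route_feeds(feeds, displays, failed={"center MFD"}))
```

Because any feed can be sent to any display, losing one display reduces
how much can be shown at once but does not remove any information source
from the crew station.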

It should also be pointed out that there are signs that the
ground-based combat vehicles community may be headed in the same
direction. This is occurring with the trend toward increased use of
thermal sensors (thermal driver's viewer, independent commander's sight)
and potential use of millimeter-wave radar. The situations, and the
needs, for the tank community are very similar to those for the airborne
community; namely, multiple operators, multiple sensors, and the need to
distribute different sensor video signals for viewing by different
crewmembers at different times during the mission. The combat vehicles
community, therefore, will probably follow the trend toward a bus-type
system architecture with standardized multi-function displays and a
standardized display interface.



The system configuration diagram shown in Figure 12 is an example
from  another  high-technology  type  aircraft,  the  new  Advanced  Attack 
Helicopter.  Again  the  bus-type  architecture  is  evident.  This  example 
is  presented  here  because  it  contains  some  of  the  elements  needed  to 

Figure 12. Integrated Helmet and Display Sight System
Configuration Diagram


provide a stereo system capability. Each crewmember is provided with
a monocular version of the helmet-mounted display shown earlier.
There are two independent FLIR sensors in a pod on the nose of the
helicopter. Normally the pilot is interfaced via his helmet-mounted
sight/display to one FLIR and the co-pilot/gunner is interfaced to the
other. The system shown does not represent a stereo system, but does
begin to include some of the ingredients needed to provide stereo such
as the two sensors and "half" of a binocular helmet-mounted display
and/or the optical relay tube shown earlier. It should be emphasized
again that these controls/displays are multi-function controls/displays.
Referring back to the mission applications shown in Figure 1, they are
used for the vehicle control/pilotage part of the mission particularly
in the nap-of-the-earth flight at night. They are used for navigation,
reconnaissance, and target acquisition portions of the mission. They
are also the primary control/display interface with the weapon systems
onboard during the fire control/weapon delivery portions of the mission.

There is some interest in transitioning a portion of this
capability to the Marine Corps for use on a different airframe with only
some of the mission application areas represented by the Advanced
Attack Helicopter system, primarily vehicle control/pilotage. For
this application only a single FLIR system is affordable, and the
decision to use two monocular helmet-mounted displays will probably
be tied to overall system cost and budget constraints.

With  all  of  this  information  as  background,  it  is  now  appropriate 
to  return  to  the  subject  of  stereo  and  make  the  point  of  this  paper 
by  putting  a  question  mark  after  its  title.  We  would  all  love  to 
have  stereo  displays  in  the  cockpit.  The  display  presentation  shown 
in  black  and  white  in  Figure  13  is  a  stereo  display  of  ground  terrain 
encoded  in  red/green  format.  Pilots  would  jump  at  the  chance  to  have 
a  presentation  of  this  type  as  a  3-D  electronic  moving  map  display. 
Given  a  set  of  red/green  stereo  glasses,  terrain  features  such  as 
mountains  and  ridges  would  appear  to  "stand  out"  and  even  the  display 
of buildings and vehicular targets would be enhanced. Presentations
of this type are very exciting, and the display technology and
hardware development community is certainly capable of developing the
display system capability required to provide them, but it is not
obvious that they will find their way into the cockpit. What may get


Figure 13. Black and White Version of a Stereo Display of
Ground Terrain Encoded in Red/Green Format

Figure 14. Analog Maneuvering Flight Path Guidance Type Display


into  the  cockpit  is  a  two-dimensional  version  similar  to  the  analog 
pictorial  presentation  of  a  maneuvering  flight  path  guidance  type 
display  shown  in  Figure  14.  This  is  only  a  two-dimensional  analog 
presentation,  but  it  provides  the  pilot  all  of  the  motion,  depth,  and 
flight  control  cues  to  allow  him  to  fly  the  vehicle  according  to  a 
directed  flight  profile  and  avoid  threat  areas  as  well.  This  type 
presentation does not meet the strict definition of stereo, the subject
of discussion, but neither does it require a specialized display
device to convey it to the crew member. For a true stereo
presentation  with  perspective,  a  binocular  display  capability  is 
required  along  with  the  true  "stereo"  information  to  present.  The 
question  raised  as  a  "devil's  advocate"  in  Figure  15  is  raised  from 
the  point  of  view  of  the  major  system  or  airframe  Program  Manager 
responsible for the development and successful operational implementation
of, for example, over 500 Advanced Attack Helicopters or over 1000
F-18 aircraft. Does two plus two equal too much? Does a two-channel

Figure 15. Question to be Considered -
Does Two Plus Two Equal Too Much?

display requirement plus two sensors or sets of computer information
result in too much in terms of overall system cost, increased system
complexity (for both sensors and controls/displays), crewmember encumbrance
in the form of a binocular helmet-mounted display or head
constraint in the "boot" of an optical relay tube type device, or
increased workload, perhaps in a single-seat, multiple-task/multi-mission
environment? This question is raised to instigate and to
challenge.  It  is  raised  to  instigate  a  healthy  technical  interchange 
and  debate  among  the  various  communities  involved  in  the  DoD  process 
including  those  involved  in  the  basic  vision  research,  human  factors, 
sensor and control/display technology development, major system development,
and the operational side of the house, the military user. It is
raised  as  a  challenge  to  the  basic  research  and  technology  development 
communities  to  maintain  a  constant  awareness  of  real  operational 
problems  and  required  capabilities  within  the  fleet,  and  to  focus  the 
basic research on those areas of high payoff, and technology development
on providing the required increased capability in on-board systems
in a way that is both cost effective and compatible with the trend in
multi-function crew stations and multi-mission aircraft.


PANEL  DISCUSSION  -  APPLICABILITY  OF  3D  DISPLAY 
RESEARCH  TO  OPERATIONAL  NEEDS 


Dr.  Roger  P.  Neeland 
Chief,  Airborne  Systems  Branch 
Systems  Research  and  Development  Service 
Federal  Aviation  Administration 


I  would  like  to  say  a  personal  word  before  I  start,  as  I  am 
really  here  wearing  two  hats  today.  There  has  been  very  little  Air 
Force  representation  so  I'll  wear  my  Air  Force  hat,  as  well  as  my 
FAA  hat.  I  am  assigned  to  FAA  at  the  present  time,  but  I  also  fly 
once  a  week  with  the  Air  Force  so  I  am  also  active  in  flying  as  well 
as  engineering.  Within  FAA,  I  have  a  branch  of  engineers  to  answer 
cockpit-crew  interface  questions  for  the  Systems  Research  and 
Development  Service.  I  would  like  to  talk  about  the  FAA  perspective 
on  operational  needs  for  3-D  displays  and  cover  this  divided  into 
two  generic  areas  of  FAA  interest.  The  first  area  is  airborne  or 
aircraft  applications,  with  which  I  feel  most  comfortable.  The 
second  area  is,  obviously,  the  ground  side  air  traffic  control 
responsibilities  of  FAA. 

First  of  all  I  feel,  especially  with  the  changing  political 
environment,  I  need  to  mention  something  about  the  FAA's 
responsibilities.  Perhaps  today  they  may  be  a  little  different  than 
the  responsibilities  that  those  of  you  who  have  worked  with  FAA  in 
the  past  may  recall.  Within  the  area  of  airborne  applications,  FAA 
is  primarily  pursuing  the  certification  responsibilities  we  must 
accomplish  to  assure  that  aircraft  and  systems  operating  in  the 
national  airspace  are  safe.  We  will  be  doing  less  actual 
development  of  airborne  systems,  and  will  rely  more  on  private 
enterprise  to  come  to  FAA  with  systems  that  need  to  be  certified. 
This  may  be  a  display  by  itself  or  displays  as  part  of  a  total 
system. I personally feel that knowledge of the potential display
methods (and certainly those mentioned today fall in that
category) is important to FAA in being able to exercise this
certification responsibility as well as in exercising our
responsibilities in the areas of some systems we have to design.

FAA  is  responsible  for  such  things  as  collision  avoidance,  landing 
guidance  systems,  navigation  systems,  and  air  traffic  control 
procedures,  so  we  do  need  to  know  about  potential  displays  for  these 
systems.  On  the  air  traffic  control  side,  it  is  a  slightly 
different  situation  than  airborne  because  FAA  has  responsibility  to 
design and implement these systems, including the displays. A note
on air traffic control: please remember that these responsibilities
cover a very large geographic area. I only mention that because it
will come back later when I pass on the comments that I have from
our  air  traffic  control  people.  This  is  a  very  widely-spread 
geographic  responsibility.  Within  FAA,  air  traffic  control  system 
planning  for  the  next  20  years  is  pretty  well  underway  right  now. 
Some  of  you  may  have  seen  news  releases  in  the  last  few  days. 
Yesterday,  Mr.  Helms,  our  Administrator,  released  officially  his 
20-year  plan  oriented  toward  the  air  traffic  control  system  of  the 
future.  This  has  some  implications  for  hardware  and  certainly  for 
displays. 

Before  we  get  into  the  actual  applications,  I  just  want  to  say 
there  are  some  filters  that  I  apply  when  I  start  thinking  about  this 
technology and whether or not to use it in a particular application.
Hopefully we all do this. We need to look at what task has
to  be  performed  —  what  really  is  the  job?  Can  we  do  it  with 
simpler  displays  —  with  two-dimensional  displays?  If  we  can  do  it 
satisfactorily,  perhaps  we  don't  need  to  go  any  further.  Then  we 
need  to  ask  the  question  —  is  there  an  enhanced  capability  that 
would really come about by adding, in this case, the third
dimension?  Is  there  a  new  capability  that  can  be  defined  by  using 
this  third  dimension?  That  may  be  the  case  in  some  of  the  airborne 
applications  I  am  familiar  with.  Perhaps  it  is  not  a  matter  of 
improving  old  tasks  —  doing  them  better  —  but  perhaps  being  able 
to  do  new  tasks.  Practical  aspects  that  we  just  can't  lose  track  of 
include  the  fact  that  we  have  to  have  sensors  to  feed  these 
devices.  A  lot  of  times  we  can  come  up  with  very  nice  displays,  but 
we  can't  get  the  information  really  necessary  to  drive  that 
display.  That  doesn't  mean  that  we  stop  developing  the  display,  but 
we  don't  really  expect  to  be  able  to  implement  it  until  we  get  the 
sensors  we  need.  The  total  operating  environment  must  be 
considered.  Cockpits  get  pretty  noisy  and  vibrate  a  lot.  There  is 
limited  physical  space  in  the  cockpit  for  some  things  such  as 
volumetric  displays,  and  this  has  to  be  taken  into  account.  Other 
tasks  that  have  to  be  done  have  to  be  considered.  I  can  imagine  a 
pilot  flying  and  trying  to  use  stereographic  displays.  Usually  this 
implies  wearing  some  glasses  of  some  sort,  such  as  polarized  or 
colored.  This  might  interfere  with  other  tasks,  such  as  looking  out 
of  the  window  and  combining  visual  display  information  inside  the 
cockpit.  I  am  not  sure  you  are  aware,  but  right  now  I  cannot  fly 
with  polarized  lenses  in  my  sunglasses.  I  think  the  same  thing 
would  hold  true  for  stereographic  display  lenses  because  of 
irregular  windshields,  etc.  There  may  be  some  practical  limitations 
there.  There  are  other  tasks  the  pilot  has  to  do  so  that  he  may  be 
trying  to  use  3-D  information,  but  he  has  to  be  able  to  transfer 
back  and  forth  between  looking  at  a  display  and  looking  outside.  It 
was  brought  up  by  Dr.  Fox  and  others  this  morning  that  we  have  to 
look  at  differences  between  individuals,  whether  we  are  talking 
about using a display for a controller or a pilot. The screening
and the training of individuals that I have heard people talk about
are very important also.

As  far  as  airborne  cockpit  applications  I  can  see  in  the  future, 
final  approach  guidance  for  landing  is  going  to  be  made  possible  in 
a  practical  sense  in  the  near  future.  I  think  we  will  have  the 
sensors  to  do  this  because  of  the  new  microwave  landing  systems  with 
precision  distance  information.  There  has  been  some  work  done  on 
this already by some of you here and others who are making these
types  of  displays  using  at  least  a  2-D  projection  of  3-D 
information.  Personally,  I  would  be  very  interested  to  see  if  we 
can  compare  a  2-D  projection  of  3-D  information  with  true  3-D 
information  and  see  if  there  is  any  difference  in  capability  between 
them.  This  is  a  possible  area  of  application.  It  may  be  necessary 
to  enlarge  or  distort  the  vertical  dimension  in  this  case  for  final 
approach  guidance  because  you  typically  have  dimensions 
longitudinally  of  perhaps  5  miles,  vertically  1500  feet,  and 
laterally  200  feet.  There  are  order  of  magnitude  differences  here 
that  may  need  some  enlarging  to  give  useful  visual  cues  to  the 
pilot.  An  extension  beyond  final  approach  guidance  would  be 
vertical  guidance,  metering,  and  spacing.  Standard  arrival  routings 
that  we  fly  now  require  both  course  and  altitude  guidance,  so  there 
is  a  three  dimensional  problem.  This  might  be  something  we  can  use 
in  the  cockpit.  If  we  go  to  metered  arrivals,  this  casts  time  as  a 
true  fourth  dimension.  Some  approaches  have  been  tried  by  the 
National  Aeronautics  and  Space  Administration;  for  example,  having  a 
moving  box  along  a  flat  projected  3-D  display,  but  there  may  be,  in 
fact,  a  need  for  true  3-D  displays.  Cockpit  display  of  other 
traffic  for  spacing  purposes  is  being  pursued  by  NASA  and  FAA,  and 
this could very well use a 3-D type of display. Collision avoidance
is  something  FAA  is  actively  pursuing,  and  my  branch  is  looking  at 
various  display  mediums  and  techniques  for  collision  avoidance. 
Perhaps  there  is  a  use  for  3-D  displays  in  this  area.  If  we  have 
what  we  are  calling  a  full-capability  collision  avoidance  system 
that  has  3-D  information,  that  is,  angular  information  as  well  as 
range  and  altitude  difference,  perhaps  we  could  feed  the  3-D 
display. We might need that for a proximity warning. If another
aircraft  is  close  enough  that  I  need  to  maneuver,  it  would  be  good 
for  me  to  have  this  3-D  information.  If  I  only  need  to  know  that  he 
is  somewhere  in  the  vicinity,  it  may  not  be  worthwhile  going  to  that 
extra  complexity.  Another  area  of  possible  application,  a  little 
further out again because of sensors, would be the display of
atmospheric anomalies in the cockpit. For example, I have
responsibility for looking at wake vortices, the turbulent air
following behind aircraft. We have to do some extra spacing between
aircraft  because  these  vortices  are  out  there.  If  there  could  be 
developed  a  sensor  to  track  those  and  a  display  to  present  them  to 
the  pilot,  we  might  be  able  to  space  aircraft  a  little  closer 
together.  I  think  a  lot  of  us  airborne  would  like  to  see 
thunderstorms 3-dimensionally. If I had a way of looking at a cell
and seeing how deep and how wide it was, that would be very useful
to  me.  Another  aircraft  oriented  application,  but  not  strictly  in 
the  cockpit,  would  be  the  enhancement  of  simulation  techniques  and 
aircraft  simulators.  We  are  now  in  a  situation  of  having  what  are 
called  Phase  II  simulators  that  don't  have  a  3-D  capability,  and  yet 
we have shown we can safely progress a pilot through to a final check
in them. It is possible that the first time he would see an
airplane would be to go out and fly people in it. Simulators are
that good right now. Is there anything to be gained by adding
three-dimensional displays to a simulator? I don't know, but it is a
possibility. Certainly since we are using computer graphics in a
lot  of  simulator  displays  right  now,  the  information  would  be  there. 

Now  to  go  to  the  other  general  area  of  ground  side  air  traffic 
control.  The  comments  that  I  have  here  are  gleaned  from  other 
people within FAA that I have worked with. I know they run counter
to  the  feelings  of  some  of  you  here,  and  I  expect  to  hear  some 
questions  on  these  later.  In  general,  there  is  not  a  positive 
attitude  towards  the  use  of  3-D  displays  at  this  time  for  air 
traffic  control.  There  are  several  reasons,  and  I  will  try  to 
explain them. Attempts have been made in the past with some type of
a  volumetric  three-dimensional  display.  There  were  several 
questions  which  came  up  during  testing  and  it  was  felt  that  this  was 
not,  at  least  at  that  time,  a  feasible  way  to  go.  The  consensus  was 
that  the  accuracy  of  tabular  data  was  needed  for  the  responsibility 
of  the  air  traffic  controller.  For  him  to  grant  separations,  issue 
clearances,  and  authorize  descents,  he  needs  to  have  the  accuracy 
that  he  gets  from  actually  reading  altitude  on  the  plan  view  that  he 
uses  right  now.  If  he  still  has  to  have  that,  there  is  not  much 
sense  in  going  to  a  3-D  display.  I  think  that  this  has  been  one  of 
the  major  problems  —  this  idea  of  precision  requirements  which 
controllers  feel  are  too  tight  to  allow  human  perceptual 
capabilities  to  give  them  that  data. 

The  second  problem  is  one  of  scaling.  The  fact  is  that  the 
typical  controller  may  have  a  4000-foot  slice  of  altitude  with  a 
20-,  30-,  40-,  or  even  50-mile  radius  of  responsibility  so  you  have 
quite  a  disparity  among  the  three  dimensions.  In  this  case,  a 
volumetric  display  would  not  help  him  that  much  as  a  3-D  cue. 
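
The scaling disparity described above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the quoted "miles" are nautical miles, and it compares the full horizontal extent of a sector (twice the radius) with the 4000-foot altitude slice. The function name and the conversion constant are introduced here and do not come from the talk. Under those assumptions the horizontal extent is roughly 60 times the vertical extent at a 20-mile radius and roughly 150 times at a 50-mile radius.

```python
# Illustrative arithmetic only; the nautical-mile assumption is ours.
FEET_PER_NAUTICAL_MILE = 6076.12


def horizontal_to_vertical_ratio(radius_miles, altitude_slice_feet):
    """Ratio of a sector's horizontal diameter to its altitude slice."""
    diameter_feet = 2 * radius_miles * FEET_PER_NAUTICAL_MILE
    return diameter_feet / altitude_slice_feet


if __name__ == "__main__":
    # Sector radii quoted in the talk, each with a 4000-foot altitude slice.
    for radius in (20, 30, 40, 50):
        ratio = horizontal_to_vertical_ratio(radius, 4000)
        print(f"{radius}-mile radius: horizontal extent is about "
              f"{ratio:.0f} times the altitude slice")
```

A true-scale volumetric display of such a sector would therefore be nearly flat, which is one way to read the controllers' objection.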

Perhaps  on  final  approach  that  might  be  a  little  different,  but  a 
large  variance  in  scaling  still  exists  as  I  have  indicated.  As  I 
mentioned  earlier,  Mr.  Helms  has  just  briefed  how  the  future  ATC 
system  may  look,  and  it  is  going  to  be  moving  toward  automation.  As 
you know, we don't have as many controllers as we did a year ago,
and we are likely to have a reduced number for some time. The
movement was already afoot toward more automation even before the
current  situation.  We  are  going  to  move  that  way,  and  as  the 
controller  becomes  more  of  a  supervisor,  I  think  3-D  displays  will 
have  the  capability  of  allowing  him  to  visualize  the  total  traffic 
flow while allowing him to concentrate his effort as a human monitor
on intervening in the system in a particular geographic area. We
are talking 15-20 years for this. It is a very evolutionary
system. I think a lot of the applications I mentioned in the
cockpit  could  very  well  have  controller  applications  also  — 
collision  avoidance,  final  approach  guidance,  weather  display, 
metering  and  spacing.  All  could  apply  to  a  controller,  at  least  a 
terminal  area  controller. 

I would say, in conclusion, that going to 3-D displays must be
in response to some sort of a validated need or some perceived
capability  that  is  available  by  going  to  3-D  displays.  Practically, 
you  have  to  consider  the  sensor,  physical  size,  weight,  and 
procedures  that  you  are  going  to  follow.  There  are  several 
potential  aircraft  applications  that  I  mentioned  —  approach 
guidance,  vertical  guidance,  collision  avoidance,  weather  display, 
and  simulation  technology.  The  groundside  applications  may  not  be 
as  immediate  because  of  the  accuracy  requirements  for  the  granting 
of clearances and the fact that the controller has many targets, but
these applications may increase as the controller becomes more of a
monitor.


APPLICABILITY OF THREE-DIMENSIONAL
DISPLAY  RESEARCH  TO  MILITARY  OPERATIONAL  NEEDS 


John  J.  Pennella 

Naval  Explosive  Ordnance  Disposal  Technology  Center 
Indian  Head,  Maryland  20640 


My  name  is  John  Pennella  and  I  am  with  the  Naval  Explosive  Ordnance 
Disposal  Technology  Center.  The  Explosive  Ordnance  Disposal  Technology 
Center  is  a  relatively  small  activity  located  about  30  miles  south  of 
Washington  in  Indian  Head,  Maryland.  Our  activity  is  a  joint  service 
center. That means that we do work for all four services under the
Administrative Management of the Naval Sea Systems Command. Our basic
mission is in developing tools, equipment, and techniques for the military
EOD technician. These tools and equipment are utilized by the EOD
technician when performing their functions in disarming hazardous ordnance.
The basic tasks required of the EOD technician are detection,
location, gaining access to, final identification of, and lastly, but
definitely most importantly, neutralization of the hazardous item.

Currently, we are investigating a myriad of ways of reducing and,
hopefully at some point, eliminating the hazard to the EOD technician
when performing these tasks. In this regard, the EOD Technology Center
is pursuing small, relatively simple, remotely controlled vehicles to
aid in the performance of a variety of these hazardous tasks. These
tasks include the underwater and surface detection, location, identification,
final placement of tools on or near the hazardous items, and
remote recovery and removal of the hazardous item to a disposal area.

I  have  been  asked  to  comment  on  the  applicability  of  the  topics 
discussed  today  to  the  problem  faced  by  the  EOD  technician.  As  a 
general overall comment, I see two areas where three-dimensional displays
would assist the EOD technician in the performance of his tasks:

(1)  Placement  of  tools  on  or  near  a  hazardous  item  is  aided  when 
the  operator  of  a  remotely  controlled  vehicle  has  depth  perception. 

(2) Scene interpretation is greatly aided by the added third
dimension.

A  number  of  the  topics  discussed  today  need,  in  my  opinion, 
further investigation. Of these, the effects of training are very
important. Earlier, the fact that training effects have an impact on
the ability of the operator to perform his tasks was discussed. EOD
technicians are highly trained specialists in rendering ordnance safe;
however, they have very little training in the use of exotic equipment.
They are trained on specific equipment once and then get periodic
on-the-job training. They may not use that specific equipment again for
six months to a year, but then are called upon to use it at a moment's
notice. In this regard, does the use of three-dimensional displays
make it easier to train the equipment operator? Is re-training accomplished
more efficiently, and does the operator perform more consistently?

Another topic that was applicable to EOD systems is the need to
define minimal system requirements to adequately perform the tasks
required by EOD technicians. For example, are the minimal
three-dimensional system requirements different from requirements for
identification and detection for tool placement? How does the system
designer determine what those minimal three-dimensional system requirements
are?


The topic of scene interpretation is highly applicable to the EOD
technician's task. During training, the technician has a known,
well-defined scene he is required to interpret. During an actual incident,
however, a very unknown scene may be, and often is, presented to the
operator; yet the operator is required to search, locate, and finally disarm
the item. The first problem the operator may encounter is the detection
of the hazardous item from the remainder of the unknown scene.
Scene interpretation is an important research area which needs further
investigation.

Operator fatigue, especially with minimally trained personnel, is
another topic that requires further investigation. Do three-dimensional
displays, or three-dimensional video presentations, decrease or
increase operator fatigue?

One of the topics discussed which is of importance in the EOD
task is the effect of three-dimensional displays on operator-manipulator
performance under degraded visual conditions, such as highly turbid
water. Can the system designer expect better performance from the
operators when utilizing three-dimensional versus two-dimensional
systems?

In conclusion, three-dimensional video displays appear to solve
many of the operational problems and limitations associated with
two-dimensional video systems. I believe that in the near future the
applicability and utility of three-dimensional video displays will be
demonstrated.